Abstract
On the one hand, pesticides may be absorbed into the body orally, dermally, ocularly and by inhalation and the human exposure may be dietary, recreational and/or occupational where toxicity could be acute or chronic. On the other hand, the environmental fate and toxicity of the pesticide is contingent on the physico-chemical characteristics of pesticide, the soil composition and adsorption. Human toxicity is also dependent on the exposure time and individual’s susceptibility. Therefore, this work will focus on the development of an Artificial Intelligence based diagnosis support system to assess the pesticide toxicological risk to humanoid, built under a formal framework based on Logic Programming to knowledge representation and reasoning, complemented with an approach to computing grounded on Artificial Neural Networks. The proposed solution is unique in itself, once it caters for the explicit treatment of incomplete, unknown, or even self-contradictory information, either in terms of a qualitative or quantitative setting.
Access provided by Autonomous University of Puebla. Download conference paper PDF
Similar content being viewed by others
Keywords
- Pesticide exposure
- Toxicity
- Environmental fate
- Artificial intelligence
- Logic programming
- Knowledge representation and reasoning
- Artificial neuronal networks
- Incomplete information
1 Introduction
Pesticides are extensively used in agriculture aiming at the control of weeds or plant diseases, which may remain in residual amounts in fruits, vegetables, grains, and water, just to name a few. They stand for xenobiotic compounds for living organisms, and their toxicity is not due to a single molecular event or interaction, but rather to a set of occurrences, starting with pesticide exposure and reaching a point of highest development with the expression of one or more toxic endpoints. These happenings include adsorption, distribution, biotransformation, distribution of metabolites, interaction with cellular macromolecules and excretion [1]. Biotransformation may result in the formation of less toxic and/or more toxic metabolites, while the various other processes determine the balance between toxic and a nontoxic upcoming [2]. Pesticides can be absorbed by oral, dermal, nasal and/or ocular exposure. Human exposure can be dietary recreational and/or occupational, and toxicity could be acute or chronic. Aggregate exposure and risk assessment involve multiple pathways and routes, including the potential for pesticide residues in food and drinking water, as well as in residues from pesticide use in residential and non-occupational environments [3]. To ensure the safety of the food supply for human consumption, Maximum Contaminant Levels (MCLs) sets the legal limits for the amount of pesticides allowed in food and drinking water. This is related with Acceptable Daily Intake (ADI), defined as the amount of a chemical that can be consumed safely every day [4].
Pesticide environmental fate and toxicity depends on the physical and chemical characteristics of pesticide, the soil composition, soil adsorption, and pesticide residues found in different soil compartments. The human hazard is determined by the pesticide properties, exposure time and the individual’s susceptibility, affecting the magnitude of these processes and the final fate and toxicity of pesticide [5]. Indeed, agricultural pesticides are incorporated into the organism by different routes which can be stored and distributed in different tissues, leading to an internal concentration that can induce alterations, adverse effects and/or diseases. Often, the human exposure to pesticides was evaluated only by human biomonitoring, i.e., measuring levels in matrixes such as blood and urine. However, the concentration measured might not relate to toxic effect [5, 6].
Recent works established pesticide impact and toxicity based on chemical properties, environmental fate and exposure considerations [7–9]. However, those methodologies for problem solving are not able to deal with incomplete data, information or knowledge. Indeed, for the development of intelligent decision support systems aimed at integrated pesticide toxicological risk assessment, it is necessary to consider different conditions with intricate relations among them. Thus, the present work reports the founding of a computational framework that uses knowledge representation and reasoning techniques to set the structure of the information and the associate inference mechanisms, i.e., it will be centered on a Proof Theoretical approach to Logic Programming (LP) [10], complemented with a computational framework based on Artificial Neural Networks (ANNs) [11].
2 Knowledge Representation and Reasoning
Many approaches to knowledge representation and reasoning have been proposed using the Logic Programming (LP) epitome, namely in the area of Model Theory [12, 13], and Proof Theory [10, 14]. In the present work the Proof Theoretical approach in terms of an extension to the LP language is followed. An Extended Logic Program is a finite set of clauses, given in the form:
where “?” is a domain atom denoting falsity, the p i , q j , and p are classical ground literals, i.e., either positive atoms or atoms preceded by the classical negation sign \( \neg \) [10], that stands for a strong declaration that speaks for itself, and not denotes negation-by- failure, or in other words, a flop in proving a given statement, once it was not declared explicitly. Under this formalism, every program is associated with a set of abducibles [12, 13], given here in the form of exceptions to the extensions of the predicates that make the program, i.e., clauses of the form:
that stand for information or knowledge that cannot be ruled out. On the other hand, clauses of the type:
also named invariants or restrictions to complain with the universe of discourse, set the context under which it may be understood. The term scoring value stands for the relative weight of the extension of a specific predicate with respect to the extensions of peers ones that make the inclusive or global program.
In order to evaluate the knowledge that may be associated to a logic program, an assessment of the Quality-of-Information (QoI), given by a truth-value in the interval \( 0, \ldots ,1 \), that branches from the extensions of the predicates that make a program, inclusive in dynamic environments, is set [15, 16]. On the other hand, a measure of one’s confidence that the argument values or attributes of the terms that make the extension of a given predicate, with relation to their domains, fit into a given interval, is also considered, and labeled as Degree of Confidence (DoC) [17]. The DoC is evaluated as described in [17] and computed using \( DoC = \sqrt {1 - \Delta l^{2} } \), where \( \Delta l \) stands for the argument interval length, which was set to the interval 0, …, 1. Thus, the universe of discourse is engendered according to the information presented in the extensions of such predicates, according to productions of the type:
where ∪ and m stand, respectively, for set union and the cardinality of the extension of predicate i . QoI i and DoC i stand for themselves.
3 Case Study
In order to develop a predictive model to assess the pesticides toxicological risk a knowledge database was set, and built around the pesticides records of the National Pesticide Information Center [18]. For each pesticide it was considered information regarding environmental fate, human exposure and toxicity (i.e., acute and chronic) both in qualitative and quantitative terms. This section demonstrates how the information comes together and how it is processed.
3.1 Qualitative Data Pre-processing
Aiming at the quantification of the qualitative information and in order to make easy the understanding of the process, it was decided to put it in a graphical form. Taking as an example a set of 3 (three) issues regarding a particular subject (where the possible alternatives are none, low, moderate, high and very high), a unitary radius circle split into 3 (three) slices is itemized (Fig. 1). The marks in the axis correspond to each of the possible choices. If the answer to issue 1 is high the area correspondent is \( \pi \times 0.75^{2} /3 \), i.e., \( 0.19 \pi \) (Fig. 1(a)). Assuming that in the issue 2 are chosen the alternatives high and very high, the correspondent area ranges in the interval \( \pi \times 0.75^{2} /3 \cdots \pi \times 1^{2} /3 \), i.e., \( 0.19 \pi \cdots 0.33\pi \) (Fig. 1(b)). Finally, in issue 3 if no alternative is ticked, all the hypotheses should be considered and the area varies in the interval \( 0 \cdots \pi \times 1^{2} /3 \), i.e., \( 0 \cdots 0.33\pi \) (Fig. 1(c)). The total area is the sum of the partial ones and is set in the interval \( 0.38 \pi \cdots 0.85\pi \) (Fig. 1(d)). The normalized area is the ratio between the area of the figure and the area of the unitary radius circle. Thus, the quantitative value regarding the subject in analysis is set to the interval \( 0.38 \cdots 0.85 \).
3.2 A Logic Programming Approach to Data Processing
It is now possible to build up a knowledge database given in terms of the extensions of the relations (or tables) depicted in Fig. 2, which denote a situation where one has to manage information in order to evaluate the Pesticide Toxicological Risk. Under this scenario some incomplete and/or default data is present. For instance, in the former case the ADI is unknown (depicted by the symbol ⊥), while the Acute Toxicity for Mice/Rats is not conclusive (Slightly/Moderate).
The Human Exposure table is populated with 0 (zero) that stands for absence, 1 (one) that denotes food or drinking water only (in dietary column), and dermal or inhalation exposure only (in occupational column), and 2 (two) stand for simultaneous exposition. The issues presented in Environmental Fate table are populated with absence, low, medium, high and very high, while the columns of Acute and Chronic Toxicity tables are filled with absence, slightly, medium, high and very high. In order to quantify the information present in these tables the procedures already described above were followed.
Applying the algorithm presented in [17] to the table or relation’s fields that make the knowledge base for Pesticide Toxicological Risk Assessment (Fig. 2), and looking to the DoCs values obtained as described in [17], it is possible to set the arguments of the predicate t oxicological r isk a ssessment (tra) referred to below, whose extensions denote the objective function with respect to the problem under analyze:
where 0 (zero) and 1 (one) denote, respectively, the truth values false and true.
The algorithm presented in [17] encompasses different phases. In the former one the clauses or terms that make extension of the predicate under study are established. In the subsequent stage the arguments of each clause are set as continuous intervals. In a third step the boundaries of the attributes intervals are set in the interval [0, 1] according to a normalization process given by the expression \( (Y - Y_{min} )/(Y_{max} - Y_{min} ) \), where the Y s stand for themselves. Finally, the DoC is evaluated as described in Sect. 2.
Exemplifying the application of the algorithm presented in [17], in relation to the term (clause) that presents the feature vector ADI = 0.01, MCL = ⊥, EF = 0.28, AT = [0.64, 0.81], CT = [0.04, 0.06], DHE = 1, OE = 2, one may have:
4 Artificial Neural Networks
On the one hand, ANNs denote a set of connectionist models inspired in the behaviour of the human brain. In particular, the MultiLayer Perceptron (MLP) model stands for the most popular ANN architecture, where neurons are grouped in layers and only forward connections are set [19]. This provides a powerful base-learner with some advantages with respect to other approaches (e.g., adaptability, robustness, flexibility, nonlinear mapping and noise tolerance), a reason why they are increasingly used in data mining, namely due to its good behaviour in terms of predictive knowledge [20]. The interest in MLPs was stimulated by the advent of the Backpropagation algorithm in 1986, and since then several fast gradient based variants have been proposed (e.g., RPROP) [21]. Yet, these training algorithms minimize an error function by tuning the modifiable parameters of a fixed architecture, which needs to be set a priori. The MLP performance will be sensitive to this choice, i.e., a small network will provide limited learning capabilities, while a large one will induce generalization loss (i.e., over fitting). MLP is molded on three or more layers of artificial neurons, including an input layer, an output layer and a number of hidden layers with a certain number of active neurons. In addition, there is also a bias, which is only connected to neurons in the hidden and output layers [19]. The correct design of the MLP topology is a complex and crucial task, commonly addressed by trial-and-error procedures (e.g., exploring different number of hidden nodes), in a blind search strategy, which only goes through a small set of possible configurations. More elaborated methods have also been proposed, such as pruning [22] and constructive [23] algorithms, although these perform hill-climbing and are thus prone to local minima [11]. The number of nodes in the input layer sets the number of independent variables, and the number of nodes in the output layer denotes the number of dependent ones [19].
On the other hand, the framework presented previously shows how the information comes together and how it is processed. In this section, a data mining approach to deal with the processed information is considered. A hybrid computing approach was set to model the universe of discourse, where the computational part is based on ANNs, whose behavior was referred to above, and used not only to structure data but also to capture the problem(s) objective function’s nature (i.e., the relationships between inputs and outputs) [24, 25].
Figure 3 shows a case being submitted to the Pesticide Toxicological Risk Assessment model. The normalized values of the interval boundaries and its QoI’s and DoC’s stand for the inputs to the ANN. The output is given in terms of Pesticide Toxicological Risk evaluation and the degree of confidence that one has on such a happening. In this study 142 pesticides were considered (i.e., one hundred and forty two terms or clauses of the extension of predicate tra). To implement the evaluation mechanisms and to test the model, ten folds cross validation were applied [19]. The back propagation algorithm was used in the learning process of the MLP. As the output function in the pre-processing layer it was used the identity one, while in the other layers we considered the sigmoid.
A common tool to evaluate the results presented by the classification models is the coincidence matrix, a matrix of size L × L, where L denotes the number of possible classes. This matrix is created by matching the predicted and target values. L was set to 2 (two) in the present case. Table 1 presents the coincidence matrix of the ANN model, where the values presented denote the average of 25 (twenty five) experiments. A glance at Table 1 shows that the model accuracy was 93.7 % (133 instances correctly classified in 142). Therefore, the predictions made by the ANN model are satisfactory, attaining accuracies higher than 90 %.
Based on coincidence matrix it is possible to compute sensitivity, specificity, Positive Predictive Value (PPV) and Negative Predictive Value (NPV) of the classifier. Briefly, sensitivity evaluates the proportion of true positives that are correctly identified as such, while specificity translates the proportion of true negatives that are correctly identified. PPV stands for the proportion of cases with positive results which are correctly classified while NPV is the proportion of cases with negative results which are successfully labeled. The values obtained for sensitivity, specificity, PPV and NPV were 94.7 %, 91.5 %, 95.7 % and 89.6 %, respectively. On the one hand, the proposed model correctly identified 94.7 % of the positive cases, i.e., pesticides with potential toxicological risk. On the other hand, it also classified appropriately 91.5 % of the negative cases, i.e., pesticides with low toxicological risk.
The present model, beyond to consider the pesticide chemical properties, enables the integration of Acute and Chronic Toxicity data with other factors such as Environmental Fate and Human Exposure, being therefore assertive in the prediction of Pesticide Toxicological Risk. Thus, it is our claim that the proposed model is able to evaluate the Toxicological Risk properly and can be a major contribution to achieve high levels regarding public health protection and environmental sustainability.
The LP approach to data processing presented in this work is a generic one, and therefore may be applied in different grounds. Indeed, some interesting results have been obtained, namely in the fields of Education [26, 27], pharmacological properties of Essential Oils evaluation [28, 29], and Health [30, 31].
5 Conclusions
The proposed approach is able to give an adequate response to the need for a good response to predict the toxicological risk of pesticide exposure. Nevertheless, it can be considered a hard task since it is necessary to consider different variables and/or conditions with complex relations entwined among them, where the data may be incomplete, self-contradictory, and even unknown. In order to overcome these difficulties this work presents the founding of a hybrid computing approach that uses a powerful knowledge representation and reasoning mechanism to set the structure of the information, complemented with a computational framework based on ANNs, which have been selected due to their proper dynamics, like adaptability, robustness, and flexibility. This approach not only allows evaluating the pesticide toxicological risk, but it also permits the estimation of the degree-of-confidence that one has on such a happening. In fact, this is one of the added values of this approach that arises from the complementarily between Logic Programming (for knowledge representation and reasoning) and the computing process based on ANNs. The present model is a generic one, susceptive of application in different arenas. A possible limitation on its use is not on the model in itself, but on the unavailability of data, information or knowledge; but, even in these situations, once it has the capacity to handle incomplete data, information or knowledge, either in its qualitative or quantitative form, its usefulness is assured. Future developments of the model should include the biotransformation pathways and routes of exposure, and consider the contact time and the individual’s susceptibility. Furthermore, this problem might be approached using others computational frameworks like Case Based Reasoning [27], Genetic Programming [14], or Particle Swarm [32], just to name a few.
References
Hodgson, E.: Introduction to pesticide biotransformation and disposition. In: Hodgson, E. (ed.) Pesticide Biotransformation and Disposition, pp. 1–3. Elsevier, Amsterdam (2012)
Needham, L.L., Patterson, D.G., Barr, D.B., Grainger, J., Calafat, A.M.: Uses of speciation techniques in biomonitoring for assessing human exposure to organic environmental chemicals. Anal. Bioanal. Chem. 381, 397–404 (2005)
Environmental Protection Agency: General Principles for Performing Aggregate Exposure and Risk Assessments. Item 6043. https://www.epa.gov/sites/production/files/2015-07/documents/aggregate.pdf
Renwick, A.G.: Pesticide residue analysis and its relationship to hazard characterisation (ADI/ARfD) and intake estimations (NEDI/NESTI). Pest Manag. Sci. 58, 1073–1082 (2002)
Esteban, M., Castaño, A.: Non-invasive matrices in human biomonitoring: a review. Environ. Int. 35, 438–449 (2009)
Angerer, J., Ewers, U., Wilhelm, M.: Human biomonitoring: state of the art. Int. J. Hyg. Environ. Health 210, 201–228 (2007)
Antón, A., Castells, F., Montero, J.I., Huijbregts, M.: Comparison of toxicological impacts of integrated and chemical pest management in mediterranean greenhouses. Chemosphere 54, 1225–1235 (2004)
Alister, C., Kogan, M.: ERI: Environmental risk index. A simple proposal to select agrochemicals for agricultural use. Crop Prot. 25, 202–211 (2006)
Juraske, R., Antón, A., Castells, F., Huijbregts, M.A.: PestScreen: a screening approach for scoring and ranking pesticides by their environmental and toxicological concern. Environ. Int. 33, 886–893 (2007)
Neves, J.: A logic interpreter to handle time and negation in logic databases. In: Muller, R., Pottmyer, J. (eds.) Proceedings of the 1984 Annual Conference of the ACM on the 5th Generation Challenge, pp. 50–54. Association for Computing Machinery, New York (1984)
Cortez, P., Rocha, M., Neves, J.: Evolving time series forecasting ARMA models. J. Heuristics 10, 415–429 (2004)
Kakas, A., Kowalski, R., Toni, F.: The role of abduction in logic programming. In: Gabbay, D., Hogger, C., Robinson, I. (eds.) Handbook of Logic in Artificial Intelligence and Logic Programming, vol. 5, pp. 235–324. Oxford University Press, Oxford (1998)
Pereira, L.M., Anh, H.T.: Evolution prospection. In: Nakamatsu, K., Phillips-Wren, G., Jain, L.C., Howlett, R.J. (eds.) New Advances in Intelligent Decision Technologies. SCI, vol. 199, pp. 51–64. Springer, Berlin (2009)
Neves, J., Machado, J., Analide, C., Abelha, A., Brito, L.: The halt condition in genetic programming. In: Neves, J., Santos, M.F., Machado, J.M. (eds.) EPIA 2007. LNCS (LNAI), vol. 4874, pp. 160–169. Springer, Heidelberg (2007)
Lucas, P.: Quality checking of medical guidelines through logical abduction. In: Coenen, F., Preece, A., Mackintosh, A. (eds.) Research and Development in Intelligent Systems XX, pp. 309–321. Springer, London (2003)
Machado, J., Abelha, A., Novais, P., Neves, J., Neves, J.: Quality of service in healthcare units. In: Bertelle, C., Ayesh, A. (eds.) Proceedings of the ESM 2008, pp. 291–298. Eurosis – ETI Publication, Ghent (2008)
Fernandes, F., Vicente, H., Abelha, A., Machado, J., Novais, P., Neves, J.: Artificial neural networks in diabetes control. In: Proceedings of the 2015 Science and Information Conference (SAI 2015), pp. 362–370. IEEE Edition (2015)
National Pesticide Information Center. http://npic.orst.edu/index.html
Haykin, S.: Neural Networks and Learning Machines. Pearson Education, New Jersey (2009)
Mitra, S., Pal, S., Mitra, P.: Data mining in soft computing framework: a survey. IEEE Trans. Neural Netw. 13, 3–14 (2002)
Riedmiller, M.: Advanced supervised learning in multilayer perceptrons—from backpropagation to adaptive learning algorithms. Comput. Stand. Interfaces 16, 265–278 (1994)
Thimm, G., Fiesler, E.: Evaluating pruning methods. In: Proceedings of the International Symposium on Artificial Neural Networks, pp. 20–25. National Chiao-Tung University Edition (1995)
Kwok, T., Yeung, D.: Constructive algorithms for structure learning in feedforward neural networks for regression problems: a survey. IEEE Trans. Neural Netw. 8, 630–645 (1997)
Vicente, H., Couto, C., Machado, J., Abelha, A., Neves, J.: Prediction of water quality parameters in a reservoir using artificial neural networks. Int. J. Des. Nat. Ecodyn. 7, 309–318 (2012)
Vicente, H., Dias, S., Fernandes, A., Abelha, A., Machado, J., Neves, J.: Prediction of the quality of public water supply using artificial neural networks. J. Water Supply: Res. Technol. – AQUA 61, 446–459 (2012)
Figueiredo, M., Neves, J., Vicente, H.: A soft computing approach to quality evaluation of general chemistry learning in higher education. In: Caporuscio, M., De la Prieta, F., Di Mascio, T., Gennari, R., Rodríguez, J.G., Vittorini, P. (eds.) Methodologies and Intelligent Systems for Technology Enhanced Learning. Advances in Intelligent and Soft Computing, vol. 478, pp. 81–89. Springer International Publishing, Cham (2016)
Neves, J., Figueiredo, M., Vicente, L., Vicente, H.: A case based reasoning view of school dropout screening. In: Kim, K.J., Joukov, N. (eds.) Information Science and Applications. LNEE, vol. 376, pp. 953–964. Springer, Singapore (2016)
Neves, J., Martins, M.R., Candeias, F., Arantes, S., Piteira, A., Vicente, H.: An assessment of pharmacological properties of schinus essential oils – a soft computing approach. In: Proceedings 30th European Conference on Modelling and Simulation (ECMS 2016), pp. 107–113. European Council for Modelling and Simulation Edition (2016)
Neves, J., Martins, M.R., Candeias, F., Ferreira, D., Arantes, S., Cruz-Morais, J., Gomes, G., Macedo, J., Abelha, A., Vicente, H.: Logic programming and artificial neural networks in pharmacological screening of schinus essential oils. Int. J. Biol. Biomol. Agric. Food Biotechnol. Eng. 9, 706–711 (2015). World Academy of Science, Engineering and Technology, International Science Index 103
Vilhena, J., Vicente, H., Martins, M.R., Grañeda, J., Caldeira, F., Gusmão, R., Neves, J., Neves, J.: Antiphospholipid syndrome risk evaluation. In: Rocha, Á., Correia, A.M., Adeli, H., Reis, L.P., Teixeira, M.M. (eds.) New Advances in Information Systems and Technologies. Advances in Intelligent Systems and Computing, vol. 444, pp. 157–167. Springer International Publishing, Cham (2016)
Neves, J., Martins, M.R., Vilhena, J., Neves, J., Gomes, S., Abelha, A., Machado, J., Vicente, H.: A soft computing approach to kidney diseases evaluation. J. Med. Syst. 39, 131 (2015). doi:10.1007/s10916-015-0313-4
Mendes, R., Kennedy, J., Neves, J.: Watch thy neighbor or how the swarm can learn from its environment. In: Proceedings of the 2003 IEEE Swarm Intelligence Symposium (SIS 2003), pp. 88–94. IEEE Edition (2003)
Acknowledgments
This work has been supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2013.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Coelho, C., Martins, M.R., Lima, N., Vicente, H., Neves, J. (2016). An Assessment to Toxicological Risk of Pesticide Exposure. In: Li, H., Nykänen, P., Suomi, R., Wickramasinghe, N., Widén, G., Zhan, M. (eds) Building Sustainable Health Ecosystems. WIS 2016. Communications in Computer and Information Science, vol 636. Springer, Cham. https://doi.org/10.1007/978-3-319-44672-1_12
Download citation
DOI: https://doi.org/10.1007/978-3-319-44672-1_12
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44671-4
Online ISBN: 978-3-319-44672-1
eBook Packages: Computer ScienceComputer Science (R0)