Introduction

With the increasing consumption of pharmaceutical products worldwide, environmental pollution caused by pharmaceutical emerging contaminants (PECs) has become a global public problem attracting growing concern. So far, various categories of PECs, including antibiotics, β-blockers, analgesics and nonsteroidal anti-inflammatory drugs (NSAIDs), antineoplastics, blood lipid-lowering agents, central nervous system acting drugs, antiviral and antiparasitic drugs, hormones, and their metabolites, have frequently been detected in environmental samples collected from different countries and regions around the world (Cunha et al., 2019; Fonseca et al., 2021; He et al., 2017; Hu et al., 2021; Kar et al., 2018; Li et al., 2020). Following discharge from domestic wastewater, industry, medical institutions, etc., pharmaceuticals undergo complex transport and transformation processes in the environment. Owing to the generally inefficient removal of PECs by common wastewater treatment plants, the unimpeded flow and spread of these emerging contaminants result in their wide and unquantifiable distribution in surface water, groundwater, sediments, soil, organisms, and other compartments of the ecosystem (Guedes-Alonso et al., 2020; Kar et al., 2018). As a class of compounds specifically designed for potent biological activity, pharmaceuticals can exert their intrinsic toxicities on non-target organisms that are chronically exposed to PECs in the environment, even at trace or ultra-trace levels (Cunha et al., 2019; He et al., 2017; Jiao et al., 2022; Li et al., 2020; Miller et al., 2018; Wang et al., 2021). 
Moreover, unpredictable interactions among multitudinous PECs including pharmaceutical parent compounds, metabolites, and other contaminants coexisting in receiving environments aggravate and amplify the environmental risks posed by PECs to the ecosystem (De Vaugelade et al., 2017; Koltsakidou et al., 2019; Ofrydopoulou et al., 2021).

To address the environmental issues of PECs, the European Medicines Agency (EMA) and the United States (US) Food and Drug Administration (FDA) have, in recent decades, developed and introduced various environmental risk assessment (ERA) guidelines to monitor the residual levels and evaluate the potential risks of pharmaceuticals in the environment (Jose et al., 2020). For example, an ERA addressing the drug’s environmental fate and impact must be provided by the pharmaceutical company as part of the registration procedure before a new human pharmaceutical can be approved in the European Union (EU) and the USA (Holm et al., 2013; Jiao et al., 2022; Kar et al., 2018). The risk quotient (RQ) is commonly applied in the ERA to assess harmful effects of PECs on ecosystems and is defined as the ratio of the maximum measured environmental concentration (MEC) to the predicted no effect concentration (PNEC) (Holm et al., 2013; Molnar et al., 2021). The determination of PNEC depends on available toxicological data, for example, ecotoxicological threshold data from experiments on representative aquatic organisms including algae, cladocerans (usually Daphnia sp.), and/or fish species (Molnar et al., 2021). In general, RQ < 0.01 denotes a negligible risk, 0.01 < RQ < 0.1 a low risk, 0.1 < RQ < 1 a medium risk, and RQ > 1 a high ecological risk to aquatic organisms (Gouveia et al., 2019; Molnar et al., 2021; Nieto-Juarez et al., 2021; Riva et al., 2019). As the simplest ERA method, the RQ-based procedure has been widely adopted to determine whether chemicals such as PECs in the environment might pose risks to ecological systems. 
However, comprehensive and reliable ERA of PECs is recognized to be hampered by considerable inherent weaknesses of the RQ method (Raimondo & Forbes, 2022), for example, the enormous diversity of PECs with different chemical, pharmacological, and toxicological properties; the limited availability of long-term toxicological data across the lifespan of different surrogate species; the limited data on residual concentrations of PECs in various environmental matrices; and the reliance on laboratory data alone. It has been suggested that risk assessment of pharmaceuticals and their metabolites based only on MEC and RQ values might underestimate their risks to the environment and humans (Wielens Becker et al., 2020). Therefore, further development of ERA tools is necessary to effectively support PEC risk management decision-making.
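The RQ calculation and the risk classes described above can be written down in a few lines. The following is a minimal sketch rather than a regulatory implementation: the function names and the example concentrations are hypothetical, and because the cited thresholds leave the boundary values open, assigning a value that falls exactly on a boundary to the higher risk class is our assumption.

```python
def risk_quotient(mec, pnec):
    """Return RQ = MEC / PNEC (both in the same concentration unit)."""
    if pnec <= 0:
        raise ValueError("PNEC must be positive")
    if mec < 0:
        raise ValueError("MEC must be non-negative")
    return mec / pnec


def classify_rq(rq):
    """Map an RQ value to the risk classes quoted in the text.

    Assumption: values exactly on a boundary (0.01, 0.1, 1) are assigned
    to the higher class, because the text leaves the boundaries open.
    """
    if rq < 0.01:
        return "negligible"
    if rq < 0.1:
        return "low"
    if rq < 1:
        return "medium"
    return "high"


# Hypothetical example: maximum MEC of 0.5 ug/L and PNEC of 2.0 ug/L.
rq = risk_quotient(0.5, 2.0)
print(rq, classify_rq(rq))  # prints: 0.25 medium
```

The sketch makes the main limitation visible: the result depends entirely on two numbers, so any gap or uncertainty in the MEC or PNEC propagates directly into the risk class.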

Driven by advances in artificial intelligence technologies, a growing number of computer-aided (in silico) approaches offer cost-effective, easily accessible, safe, and feasible animal-free testing strategies, providing new ideas and opportunities for the rapid prediction and comprehensive assessment of potential environmental risks posed by PECs. Here, we review reports on the application of in silico approaches to support the ERA of PECs as documented in recently published literature.

Methods

In this review, we searched the MEDLINE bibliographic database via PubMed using the keywords “risk(s) AND (environmental OR ecological) AND (assess OR assessment OR predict OR prediction OR predictive) AND (in silico OR computational) AND (drug(s) OR pharmaceutical(s))”. Only peer-reviewed articles written in English and fully published online within the most recent 5 years (between January 2018 and December 2022) were included. Reviews were excluded from the retrieved articles, and the remaining articles were reviewed individually to identify and select those that met the following criteria for analysis: research and illustrative articles about the ERA of PECs that were active pharmaceutical ingredients identified in the DrugBank database (Wishart et al., 2018); research articles based on in silico platforms, systems, or software for evaluation of study endpoints; and illustrative articles showing research platforms, systems, or software.

Results and discussion

A total of 20 studies employing in silico predictive approaches to aid in the ERA of PECs were found. Researchers’ interest in this topic appears to have risen sharply since 2021, as reflected by the yearly distribution of included publications: 1 (2018) (Raitano et al., 2018), 2 (2019) (Miller et al., 2019; Tung et al., 2019), 4 (2020) (Della-Flora et al., 2020; Garcia-Martin et al., 2020; Wielens Becker et al., 2020; Zhang et al., 2020), 8 (2021) (Han et al., 2021; Hua et al., 2021; Huang et al., 2021; Kumar et al., 2021; Liu et al., 2021; Marmon et al., 2021; Saavedra & Duchowicz, 2021; Sanabria et al., 2021), and 5 (2022) (Badry et al., 2022; Han et al., 2022; Kumar et al., 2022; Regnery et al., 2022; Spînu et al., 2022). The COVID-19 crisis that broke out in 2020, an uninvited guest for human society, brought some opportunities despite the enormous problems it created. In particular, in the post-pandemic era of information explosion, in silico methods have been very useful and popular in managing different projects associated with the COVID-19 crisis (Moradi et al., 2022; Sharifi et al., 2021). Moreover, owing to the increasing environmental loads of PECs resulting from intensive pharmaceutical consumption triggered by COVID-19, the ERA of PECs has attracted significant academic and political interest during the post-pandemic period (Anand et al., 2022; Guo et al., 2021; Morales-Paredes et al., 2022). These reasons might account for the sudden increase in in silico studies in the field of ERA for PECs since 2021.

Spatial distribution of included studies

Among the 20 included studies published in the last 5 years, 8 studies (40%) were from Asia (Han et al., 2021, 2022; Hua et al., 2021; Huang et al., 2021; Kumar et al., 2021; Liu et al., 2021; Tung et al., 2019; Zhang et al., 2020), another 8 studies (40%) were from Europe (Badry et al., 2022; Garcia-Martin et al., 2020; Kumar et al., 2022; Marmon et al., 2021; Miller et al., 2019; Raitano et al., 2018; Regnery et al., 2022; Spînu et al., 2022), and the remaining 4 studies (20%) were conducted by researchers from South America (Della-Flora et al., 2020; Saavedra & Duchowicz, 2021; Sanabria et al., 2021; Wielens Becker et al., 2020). As shown in Table 1, the distribution of retrieved articles across countries showed that the highest number of studies came from China (7 studies, 35%), followed by Brazil (3 studies, 15%), the UK (3 studies, 15%), Spain (2 studies, 10%), Germany (2 studies, 10%), Italy (1 study, 5%), India (1 study, 5%), and Argentina (1 study, 5%). Theoretically, the highly shareable nature of digital research resources might provide equal opportunities for researchers all over the world to conduct in silico studies. Especially in resource-limited settings, the use of in silico research tools could help to reduce the demand for laboratory materials and instructors. It can be speculated that the further expansion of computer and Internet technologies in research might allow wider study of the in silico prediction of environmental risks posed by PECs among researchers across institutions, areas, and countries, especially those from low- and middle-income countries.

Table 1 Number of publications per year for 20 studies employing the in silico approaches to aid in the ERA of PECs

Research topic analysis in the application of in silico techniques for prediction of environmental risks posed by PECs

Although the study purposes and in silico models of the included articles were diverse, we found that the research topics mainly concerned the following aspects associated with the prediction and evaluation of environmental risks posed by PECs.

Predicting bioaccumulation and biodegradability

Uptake and accumulation of active pharmaceutical ingredients in non-target organisms that inhabit PEC-impacted environments can be followed by impairment of organ-specific functions in exposed organisms once the accumulated levels of PECs exceed the safety levels for those organs (Kunene & Mahlambi, 2023; Nendza et al., 2018; Nozaki et al., 2023). More seriously, through bioaccumulation, PECs enter the food chain, thereby resulting in biomagnification of contaminants in the environment and posing potential risks to human health (Kunene & Mahlambi, 2023; Nendza et al., 2018; Nozaki et al., 2023). On the other hand, exposed organisms are able to metabolize PEC compounds through biodegradation. However, PECs can also impair this natural detoxification capability by negatively affecting metabolic processes in exposed organisms (Kunene & Mahlambi, 2023). Currently, assessment of the bioaccumulation potential of pharmaceuticals is a mandatory part of their regulatory ERA in the EU (Regnery et al., 2022). Of the 20 papers included in this review, 7 (35%) (Badry et al., 2022; Della-Flora et al., 2020; Garcia-Martin et al., 2020; Miller et al., 2019; Regnery et al., 2022; Sanabria et al., 2021; Wielens Becker et al., 2020) reported the application of in silico techniques to assess the bioaccumulation and biodegradability of PECs in the environment.

Well-accepted bioaccumulation assessments are mainly based on bioconcentration factors (BCFs) (Nendza et al., 2018). However, a standard experimental determination of BCFs requires more than 100 fish and is very time-consuming and expensive. Because they draw on structural features and physicochemical properties related to the distribution, solubility, volatility, and persistence of chemicals in water bodies and aquatic biota, quantitative structure–activity relationship (QSAR) classifications have been considered promising candidates for replacing and reducing in vivo BCF testing on fish (Nendza et al., 2018; Thomas et al., 2018). In fact, the United States Environmental Protection Agency (US EPA) and the Canadian Environmental Assessment Agency (CEAA) have routinely encouraged the use of QSAR approaches to prioritize and support new chemical registrations in recent decades (Thomas et al., 2018). To compensate for the limitations of relying on RQ values alone in the ERA of PECs and to prioritize complex mixtures of pharmaceuticals and metabolites in the environment more proactively, Wielens Becker et al. (2020) employed the in silico Prometheus software, which uses a battery of QSAR models, as a complementary tool for predicting and ranking the parent pharmaceuticals and metabolites occurring in hospital wastewater samples as possible bioaccumulative compounds in terms of biodegradability. A Brazilian research team (Della-Flora et al., 2020; Sanabria et al., 2021) assessed the biodegradability of anti-cancer drugs and their environmental transformation products using freely available in silico QSAR software, including BIOWIN 1–7 and VEGA (IRFMN model). Prometheus software was used to rank the transformation products as possible persistent, bioaccumulative, and toxic (PBT) compounds. 
The in silico predictions indicated that the transformation products formed during the degradation of the anti-cancer drugs flutamide and anastrozole were not biodegradable, while some of them were classified near the threshold for being considered PBT compounds (Della-Flora et al., 2020; Sanabria et al., 2021). Another study (Regnery et al., 2022) performed in silico BCF predictions using the regression-based QSAR model BCFBAF (v3.01, EPI Suite, US EPA) to conduct in vitro to in vivo extrapolation for the bioaccumulation assessment of the β-blocker propranolol and the anticoagulants phenprocoumon and warfarin in fish, given that experimental data on fish metabolism for these PECs were rarely available. Using high-resolution mass spectrometry coupled to liquid and gas chromatography, Badry et al. (2022) detected the anthelmintic agent oxfendazole in all 30 livers of German white-tailed sea eagles, apex predators foraging on fish and water birds, at an average level of 40.6 ng/g. In line with this observation, when its PBT properties were predicted with the JANUS tool (https://www.vegahub.eu/portfolio-item/janus/), which is based on a battery of QSAR models, oxfendazole had a P (persistence) score of 0.712 (a score > 0.6 indicates that PBT properties are likely to be met) and was thus also predicted to persist in the environment. This finding suggested that the in silico JANUS software might be reliable for predicting PBT properties for certain PECs. Nevertheless, other PECs with a score > 0.6, including pindolol, desethylhydroxychloroquine, sulfadoxine, lidocaine, and lidocaine-N-oxide, were found at very low concentrations and detection rates in liver samples, suggesting mismatches between observed exposures and in silico predicted PBT properties. Exposure of apex predators to some PECs may thus be shaped by complex factors that are not solely associated with intrinsic chemical properties.
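For readers less familiar with the quantity that these QSAR tools estimate, the sketch below shows how a steady-state BCF is obtained from measured concentrations and how it is typically screened. It is illustrative only: the function names and example concentrations are hypothetical, and the cut-offs of 2000 and 5000 L/kg are the "bioaccumulative" and "very bioaccumulative" criteria of REACH Annex XIII, which we assume here because the studies discussed above do not state a threshold.

```python
import math

# Assumed screening thresholds (REACH Annex XIII), in L/kg wet weight.
B_THRESHOLD = 2000.0
VB_THRESHOLD = 5000.0


def bioconcentration_factor(c_organism_ug_per_kg, c_water_ug_per_l):
    """Steady-state BCF (L/kg): tissue concentration / water concentration."""
    if c_water_ug_per_l <= 0:
        raise ValueError("water concentration must be positive")
    return c_organism_ug_per_kg / c_water_ug_per_l


def classify_bcf(bcf):
    """Screen a BCF value against the assumed B and vB thresholds."""
    if bcf > VB_THRESHOLD:
        return "very bioaccumulative"
    if bcf > B_THRESHOLD:
        return "bioaccumulative"
    return "not bioaccumulative"


# Hypothetical example: 1500 ug/kg in fish tissue, 0.5 ug/L in water.
bcf = bioconcentration_factor(1500.0, 0.5)
print(bcf, round(math.log10(bcf), 2), classify_bcf(bcf))
```

A QSAR model such as those discussed above replaces the measured tissue concentration with a BCF predicted from molecular descriptors; the screening step itself stays the same.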

Compared with conventional statistics-based analyses, machine learning methods have demonstrated better performance in QSAR modeling, given the complex relationships between the structures and PBT properties of chemicals, and have been used to predict the bioaccumulation and biodegradability of PECs from training data (Miller et al., 2019; Xu et al., 2022). Miller et al. (2019) evaluated 24 linear and machine learning models for predicting the BCFs of PECs in the fish Cyprinus carpio and selected a 4-layer multilayer perceptron using 14 molecular descriptors as the optimal model. The descriptors comprised 6 topological descriptors, namely radial centric information index (ICR), ramification index (Ram), Narumi harmonic topological function (Hnar), spanning tree number (STN), superpendentic index (SPI), and topological polar surface area (TPSA); 4 constitutional descriptors, namely number of nitrogens (nN), number of carbons (nC), number of hydrogens (nH), and molecular weight (MW); 3 electrotopological descriptors, namely maximal electrotopological positive variation (MAXDP), maximal electrotopological negative variation (MAXDN), and mean atomic Sanderson electronegativity (Me); and 1 physicochemical property, the octanol-water distribution coefficient (logD). When this optimized model was applied to predict BCFs for PECs in fish and the invertebrate Gammarus pulex, it showed good performance in cross-species prediction of bioaccumulation, thus enabling rapid prioritization of PECs during ERA without the need for ethically challenging and costly animal experiments. Unlike simple hydrophobicity models, which poorly account for the complexity of PECs, this multilayer perceptron model improved the prediction of bioaccumulation and the identification of its driving molecular descriptors (Miller et al., 2019). 
BiodegPred (https://sysbiol.cnb.csic.es/BiodegPred/) (Garcia-Martin et al., 2020) is an in silico system built on a support vector machine (SVM) predictor trained to distinguish biodegradable from recalcitrant PECs using a vector representation of the compounds; it merges computational methods that predict the biodegradability of a PEC with others that assess its eventual biological toxicity. Even for PECs without existing biodegradation data, BiodegPred, using only the chemical structure as input, can provide a prognosis of the chance that a given pharmaceutical compound will eventually be metabolized in the biosphere once it is released into the environment. Blind predictions by BiodegPred for a set of antiviral agents of medical interest showed that 3 of the 148 studied antiviral agents would be “ready biodegradable,” whereas most of the antivirals belonged to the “non-ready biodegradable” category (Garcia-Martin et al., 2020).

Predicting the lethal endpoints

Aquatic acute and chronic toxicity based on median lethal concentration (LC50) values is a key and common endpoint in ecological risk assessment. We found 3 articles published since 2018 (Han et al., 2021, 2022; Raitano et al., 2018) that reported in silico prediction of PEC-induced lethal endpoints in aquatic organisms.

Han et al. (2021) constructed and validated quantitative structure-toxicity relationship (QSTR) models using the genetic function approximation (GFA) algorithm, based on data sets of 9 quinolone drugs and 7 impurities, to obtain reliable in silico mathematical models for predicting the aquatic toxicity of quinolone antibiotics. The acute toxicity LC50 values of 11 quinolones predicted by QSTR modelling were in good agreement with in vivo experimental findings in zebrafish (Danio rerio) embryos, suggesting that the constructed model is accurate and has good predictive power. The same research team then developed QSTR models for zebrafish toxicity prediction in the same way and applied them to β-lactam antibiotics (Han et al., 2022). This later study (Han et al., 2022) also used the Ecological Structure–Activity Relationships (ECOSAR) in silico tool to estimate the toxicity of insoluble penicillins and fifth-generation cephalosporins to aquatic organisms. The LC50 values for acute and chronic exposure to β-lactams predicted by the ECOSAR software were similar to the QSTR results, showing that eight β-lactams might exhibit low toxicity. Among them, mecillinam might have the strongest chronic toxic effect on fish, daphnia, and algae, and ceftibuten might cause acute and chronic toxicity to green algae (Han et al., 2022).

As mentioned above, the RQ-based procedure, which relies on the MEC/PNEC ratio, is currently the most common tool in the ERA of PECs (Gouveia et al., 2019; Molnar et al., 2021; Nieto-Juarez et al., 2021; Riva et al., 2019). However, the lack of adequate experimental data reported in the literature to support the calculation of PNECs, which are usually derived from LC50 values, is regarded as a major challenge for the RQ assessment of various PECs (Raimondo & Forbes, 2022). In silico methods help to fill these information gaps in the RQ-based procedure. Raitano et al. (2018) employed multiple in silico QSAR modeling platforms to obtain the hazard values required for PECs when experimental data were missing or uncertain. In this study (Raitano et al., 2018), the EPI Suite™ (Estimation Program Interface) software developed by the US EPA, based on the ECOSAR model, was used to predict acute and chronic toxicity endpoints in algae, Daphnia magna, and fish; T.E.S.T. (Toxicity Estimation Software Tool) and VEGA (Virtual models for property Evaluation of chemicals within a Global Architecture) provided predicted LC50 values for fish acute toxicity; and T.E.S.T. also predicted Daphnia magna acute toxicity. The data obtained were then used to support the calculation of RQ values of PECs for ecological risk assessment, and the results showed high RQs for PECs including clarithromycin, furosemide, ranitidine, and diazinon occurring in water samples collected from the Ledra River, Italy (Raitano et al., 2018).
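The gap-filling logic described above can be sketched as follows. This is not the workflow of Raitano et al. (2018); it is a minimal illustration that assumes the common deterministic approach of dividing the lowest available acute L(E)C50 across algae, daphnia, and fish by an assessment factor of 1000. The function name, the assessment factor default, and all toxicity values are hypothetical.

```python
def derive_pnec(experimental, predicted, assessment_factor=1000.0):
    """Derive a PNEC from acute L(E)C50 values (mg/L) per taxon.

    Experimental values take precedence; an in silico prediction is used
    only where the experimental value is missing (absent or None).
    Returns (pnec, driving_taxon, data_source).
    """
    candidates = []
    for taxon in sorted(set(experimental) | set(predicted)):
        if experimental.get(taxon) is not None:
            candidates.append((experimental[taxon], taxon, "experimental"))
        elif predicted.get(taxon) is not None:
            candidates.append((predicted[taxon], taxon, "predicted"))
    if not candidates:
        raise ValueError("no toxicity data available")
    # The most sensitive taxon (lowest effect concentration) drives the PNEC.
    value, taxon, source = min(candidates)
    return value / assessment_factor, taxon, source


# Hypothetical data: only the fish value was measured experimentally.
experimental = {"fish": 12.0, "daphnia": None}
predicted = {"fish": 20.0, "daphnia": 4.0, "algae": 0.8}

pnec, taxon, source = derive_pnec(experimental, predicted)
print(pnec, taxon, source)
```

Returning the driving taxon and the data source alongside the PNEC keeps track of whether an RQ ultimately rests on a measured or a predicted hazard value, which matters when the result is used for prioritization.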

Predicting developmental toxicity

Among the included articles, 3 studies used in silico methods to predict the general developmental toxicity (Saavedra & Duchowicz, 2021), cardiac developmental toxicity (Han et al., 2021), and developmental neurotoxicity (Spînu et al., 2022) of PECs, respectively.

Saavedra & Duchowicz (2021) developed a QSAR model based on a zebrafish embryo developmental toxicity database provided by the ToxCast™ phase I chemical library of the US EPA Computational Toxicology Research Program, and applied it to predict the half-maximal active concentration (AC50) of 188 antimicrobial, antibacterial, and antiparasitic products using a set of 28,038 non-conformational molecular descriptors, which encode permanent structural features affecting general developmental toxicity. This in silico method offers a promising route to low-cost, cruelty-free, high-throughput screening and exhaustive analysis during the ERA of zebrafish embryo developmental toxicity for a structurally diverse set of PECs (especially new, untested, or even hypothetical PECs) according to their predicted effects on thousands of potential molecular targets.

After exposure to 12 quinolone drugs from 6 to 72 h post-fertilization (hpf), more than 80% of abnormal zebrafish embryos exhibited heart malformations, suggesting cardiac developmental toxicity induced by quinolone PECs (Han et al., 2021). Consistent with this laboratory-based finding, in silico molecular docking predicted good binding affinity between quinolones and the active binding pocket of the zebrafish ERG protein as a molecular target; the human homolog gene encodes the Kv11.1 potassium ion channel responsible for quinolone-induced QT prolongation (Han et al., 2021). This study indicated that, when a well-identified molecular target is available as an ecological risk endpoint, in silico molecular docking simulation might be a useful tool for the ERA of PECs.

Considering the susceptibility of the developing nervous system to the neurotoxic effects of exogenous chemicals, testing compounds including PECs for developmental neurotoxic potential has become a significant societal and scientific goal (Atzei et al., 2021). The presence of environmentally relevant PECs such as psychoactive pharmaceuticals in the aquatic environment has been linked to neurodevelopmental toxicity in fish species (Atzei et al., 2021). However, because developmental neurotoxicity is a complex process involving multiple cellular and molecular pathways, current common methods for developmental neurotoxicity testing, including measures of gross brain morphology, neurochemistry, a range of behavioral assays, and biomarkers of gliosis and cytotoxicity, are generally complex and expensive in terms of scientific resources, animal use, and time (Spînu et al., 2022). Spînu et al. (2022) developed a hierarchical multiparameter model of an adverse outcome pathway (AOP) network for developmental neurotoxicity testing based on Bayesian machine learning, to predict the probability that a PEC induces each of three upstream common key events of the AOP network as well as the adverse outcome of developmental neurotoxicity. The modelling workflow could deal with missing values, accommodate unbalanced and correlated data, and follow the structure of a directed acyclic graph (DAG) to simulate the biological pathway. The Bayesian model predicted developmental neurotoxicity potential in a data set of 88 compounds, including pharmaceuticals, industrial chemicals, and pesticides, with an accuracy of 76% (Spînu et al., 2022).

Predicting mutagenicity

Currently, in silico approaches ranging from relatively simple linear statistical techniques to sophisticated machine learning methods are widely used to predict and assess the mutagenicity of compounds, that is, their ability to cause mutations in the deoxyribonucleic acid (DNA) sequence (Kumar et al., 2021). In particular, the mutagenic, genotoxic, and carcinogenic potential of many pharmaceutical compounds makes them interesting candidates for study, especially when they enter the environment and food chain and interfere with ecosystems as PECs (Sharif et al., 2016). Kumar et al. (2021) developed an in silico model based on a deep neural network (DNN), a class of machine learning algorithms that does not require manual feature extraction, as a novel architecture for mutagenicity prediction. Compared with traditional machine learning methods including SVM, k-nearest neighbor, and random forest, the DNN-based model achieved the highest prediction accuracy, 92.95% on the training set and 83.81% on the test set, with training and test sets comprising thousands of compounds including PECs. To predict the mutagenicity of the anti-cancer drugs flutamide and anastrozole and their environmental transformation products, the VEGA 1.1.4 software was used, including the Ames test CONSENSUS model v. 1.0.2 and 4 different carcinogenicity models (CAESAR v. 2.1.9, ISS v. 1.0.2, IRFMN/Antares v. 1.0.0, and IRFMN/ISSCAN-CGX v. 1.0.0) (Della-Flora et al., 2020; Sanabria et al., 2021). Additionally, the QSAR Toolbox v. 4.3.1 software was employed to provide a series of complementary mutagenicity alerts generated for the chemical structures. 
Results showed that some transformation products of anastrozole, the parent compound flutamide, and most of its transformation products were predicted to give positive alerts concerning the mutagenicity and carcinogenicity endpoints (Della-Flora et al., 2020; Sanabria et al., 2021).

Predicting other toxic effects, such as ototoxicity and hematological toxicity

Many environmental chemicals, such as carbon disulfide, butyl nitrite, trichloroethylene, styrene, and xylene, are known to cause organ-specific toxicity to the inner ear. It can be speculated that, once they enter the environment as PECs, common ototoxic drugs including aminoglycoside antibiotics, loop diuretics, salicylates, quinine, and platinum-based anticancer drugs would be likely to cause hearing loss or tinnitus in exposed animals and even humans. Zhang et al. (2020) compiled a data set of 897 ototoxic pharmaceuticals and 1715 approved drugs and developed an in silico model using a naïve Bayes classifier. This model was reported to provide an overall prediction accuracy of 88.7% for the external test set. Furthermore, Huang et al. (2021) proposed a higher-quality data set containing 1102 ototoxic pharmaceutical agents and 1705 non-ototoxic drugs and then developed in silico ototoxicity models using machine learning and deep learning algorithms on the online chemical database and modeling environment (OCHEM), which could be used for the ERA of ototoxic PECs. The consensus model, which showed high predictive accuracy, and the data sets used for model development were made publicly and freely available at https://ochem.eu/model/46566321.
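To illustrate the kind of classifier mentioned above, the following sketch implements a Bernoulli naïve Bayes model on binary structural fingerprints. It is a generic textbook formulation and not the model of Zhang et al. (2020): the four-bit fingerprints, the labels, and the Laplace smoothing parameter are all hypothetical, and real applications use fingerprints with hundreds or thousands of bits.

```python
import math


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    X: list of equal-length 0/1 fingerprints; y: list of 0/1 labels
    (1 = ototoxic, 0 = non-ototoxic). Both classes must be present.
    alpha is the Laplace smoothing parameter.
    """
    n_bits = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, target in zip(X, y) if target == label]
        log_prior = math.log(len(rows) / len(X))
        # Smoothed probability that each bit is set within this class.
        p_bit = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_bits)]
        model[label] = (log_prior, p_bit)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for fingerprint x."""
    scores = {}
    for label, (log_prior, p_bit) in model.items():
        score = log_prior
        for bit, p in zip(x, p_bit):
            score += math.log(p if bit else 1.0 - p)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data: each bit flags the presence of a substructure.
X = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0],
     [0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
y = [1, 1, 1, 0, 0, 0]

model = train_bernoulli_nb(X, y)
print(predict(model, [1, 1, 0, 0]), predict(model, [0, 0, 1, 1]))
```

The "naïve" assumption is that fingerprint bits are conditionally independent given the class, which is why the per-bit log-probabilities can simply be summed.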

In addition, hematotoxicity, an adverse effect on bone marrow or blood cells, can result from environmental PECs. On the basis of a high-quality data set containing 632 hematotoxic compounds and 1525 drugs without hematotoxicity, machine learning and deep learning methods based on structurally diverse chemicals have been used to develop in silico models for the estimation of PEC-induced hematotoxicity (Hua et al., 2021). Among the 35 machine learning models and 3 deep learning models, a model developed with the random forest regression and classification algorithm (RFR) and QNPR (quantitative name property relationship) descriptors performed best among the individual models, yielding a balanced accuracy of 0.77 on external validation and a prediction accuracy of 0.83 (Hua et al., 2021).
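Because the toxic and non-toxic classes in such data sets are unequal in size, balanced accuracy and plain accuracy can differ noticeably. The sketch below computes both, together with precision and recall, from a confusion matrix. The counts are hypothetical and are not taken from Hua et al. (2021); the function name is ours.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common metrics from confusion-matrix counts.

    tp/fn: toxic compounds predicted toxic/non-toxic;
    tn/fp: non-toxic compounds predicted non-toxic/toxic.
    """
    sensitivity = tp / (tp + fn)   # recall on the toxic class
    specificity = tn / (tn + fp)   # recall on the non-toxic class
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "precision": tp / (tp + fp),
        "recall": sensitivity,
    }


# Hypothetical external set: 100 toxic and 300 non-toxic compounds.
metrics = classification_metrics(tp=60, fp=30, tn=270, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example the accuracy (0.825) exceeds the balanced accuracy (0.750) because the larger non-toxic class is predicted better than the toxic class, which is the usual reason for reporting both figures.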

In silico human health hazard assessment for PECs

PEC residues in the environment can not only cause toxic effects in exposed microorganisms, plants, and animals, but also migrate into human food and drinking water and thus pose potential risks to human health. The current ERA of PECs is mainly based on experimental animal testing as the gold standard, the findings of which are not fully concordant with human responses (Tung et al., 2019). We found 4 recent publications (Han et al., 2022; Kumar et al., 2022; Raitano et al., 2018; Tung et al., 2019) reporting in silico human health hazard assessment for PECs based on data sets of biological or toxicological endpoints from human data.

In addition to employing in silico QSAR modeling platforms to support the RQ-based ERA of PECs in algae, Daphnia magna, and fish, Raitano et al. (2018) evaluated the consequent health risks of PECs to exposed humans by developing ad hoc models that integrated multiple in silico methods and software with traditional toxicological risk analysis to predict missing reference dose (RfD) values for non-cancer health assessment and slope factors for carcinogenic toxicity, thus providing a comprehensive general picture of the (eco)toxicological maps. Similarly, building on their in silico ERA in fish, daphnia, and algae, Han et al. (2022) further predicted the human health hazards of β-lactam antibiotics using admetSAR (http://lmmd.ecust.edu.cn/admetsar2/) and found that carbenicillin, ceftaroline, ceftibuten, ceftobiprole, and ceftolozane might be hepatotoxic, and ampicillin might be carcinogenic, in humans. To achieve better predictive performance than animal testing in identifying PECs as human skin sensitizers, Tung et al. (2019) developed a transfer learning-based multitask learning method with high prediction performance and coverage that links experimental results to key events of the skin sensitization AOP in humans, and implemented a freely accessible web server named SkinSensPred (https://cwtung.kmu.edu.tw/predict) that can be used to identify and prioritize possible human skin sensitizers among PECs. This in silico method used a tree-based ensemble multitask learning algorithm that handles multiple tasks simultaneously in order to incorporate and leverage three related learning tasks corresponding to the three key events of the well-defined skin sensitization AOP: keratinocyte activation, protein binding, and dendritic cell activation (Tung et al., 2019). 
In addition, Kumar et al. (2022) investigated a pharmacophore modeling approach for human health risk assessment by screening the blood-brain barrier permeation of pharmaceuticals as xenobiotics. Through a computationally expensive process, an optimized 3D structure-data file library was generated for the pharmaceutical molecules and for substrates of P-glycoprotein, which acts as a gatekeeper in the human blood-brain barrier. Machine learning algorithms were then trained on the interaction fingerprints created from the generated data, and on combinations of docking-derived and extended connectivity (ECFP4) fingerprints, to classify compounds into permeable and non-permeable groups. This modeling pipeline was considered a generic framework providing a path to the in silico human risk assessment of environmental chemicals such as PECs (Kumar et al., 2022).

Simultaneously discerning a broad spectrum of potential environmental risks and health effects based on mechanistic endpoints

In the “real world,” each pharmaceutical often interacts with more than one target, eliciting a cascade of complex biological responses through the related genes that may cause multiple health effects. In the age of Big Data, advanced bioinformatic methods that cover biological information systematically and holistically, for example, transcriptomic analysis and network pharmacology prediction, have enabled the development and application of in silico prediction of multiple biological/toxicological effects of PECs according to the network-based drug-gene-disease relationship. Because ecotoxicity data limitations are common for PECs, prior understanding of the pharmaceuticals’ biological activities, in conjunction with molecular responses associated with mechanistic effects, is believed to provide a foundation for predicting potential ecologically relevant outcomes through in silico models (Ankley et al., 2022). Using mechanistic data on pharmaceuticals to predict adverse ecological effects of PECs is a novel and promising approach that could be applied to support PEC evaluations (Ankley et al., 2022).

To adequately and comprehensively predict the complex health risks of environmental PECs, Liu et al. (2021) exploited cell-based drug transcriptomics data from a chemo-centric perspective and proposed an in silico prioritization method for novel toxic health effects of environmental chemicals. Using non-negative matrix factorization (NMF), an algorithm for condensing sparse high-dimensional data, both the genomic and the chemotype data matrices of the pharmaceuticals were mapped into a low-dimensional latent feature space, thereby allowing association analysis between the pharmaceuticals’ structural features and transcriptomic data involving multiple gene signaling pathways and different health outcomes. In this predictive model, the biological space comprised a matrix of 953 pharmaceuticals by 20,183 genes, and the chemical space comprised a matrix of 953 pharmaceuticals by 3534 structural fragments, yielding a total of 13 pivotal types of health effects. Furthermore, this study (Liu et al., 2021) used the potential environmental estrogens from the EPA’s endocrine disruptor screening program (EDSP) as external data to verify the prediction capacity of the established model and found that the precision and recall values reached 0.76 and 0.77, respectively. Moreover, because the model allows the simultaneous prediction of a range of health effects, additional potential health impacts such as cardiovascular effects were verified for some of the tested endocrine-disrupting pharmaceuticals (Liu et al., 2021).
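To make the factorization step concrete, the sketch below shows generic NMF with the classical multiplicative update rules, which approximate a non-negative data matrix V (rows: pharmaceuticals, columns: features) as the product of two smaller non-negative matrices W and H. It is an illustration of the technique in general, not the implementation of Liu et al. (2021); the toy matrix, the chosen rank, and all function names are assumptions, and it is written in pure Python only to stay self-contained (in practice a numerical library would be used).

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate V (n x m) by W (n x rank) times H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy data: 4 pharmaceuticals x 5 features (e.g., gene responses).
V = [
    [5, 4, 0, 0, 1],
    [4, 5, 0, 1, 0],
    [0, 0, 3, 4, 4],
    [0, 1, 4, 3, 5],
]
W, H = nmf(V, rank=2)
print("reconstruction error:", round(frobenius_error(V, W, H), 3))
```

Each row of W is the low-dimensional latent representation of one pharmaceutical; factorizing a gene matrix and a structural-fragment matrix over the same set of pharmaceuticals is what makes their latent features comparable.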

Based on a network pharmacology approach, Marmon et al. (2021) developed a novel in silico multiscale model for predicting the risks posed to fish by NSAIDs and their mixtures under realistic exposure scenarios. By integrating pharmacological as well as network-centered and target-centered mechanistic considerations into the ERA of PECs, this model provided highly specific predictions of the adverse phenotypes that may occur in wild fish exposed to NSAIDs. According to the physiological functions of the targets perturbed by NSAIDs, a gene-phenotype anchoring analysis based on the Monarch Initiative platform (https://monarchinitiative.org/) indicated that the environmental risk-based NSAIDs bioactivity network might cause ecological effects on general development, the cardiovascular and immune systems, liver, pancreas, and kidney functions, as well as growth and reproduction in fish. It is worth mentioning that, among all the articles included in this review, this study (Marmon et al., 2021) is the only one that sufficiently considered the environmental residual levels of PECs and their concentrations inside the exposed organism as key parameters for realistic and accurate risk prediction. Using the Fish Plasma Model, the measured concentrations of NSAIDs in water samples such as treated wastewater treatment plant effluents and surface waters were transformed into predicted plasma concentrations in the blood of exposed wild fish. Moreover, when hazard-based bioactivity networks of NSAID mixtures as PECs in fish were generated based on the ToxCast and ChEMBL platforms, most of the drug targets were found to be shared by two or more NSAIDs, providing a mechanistic rationale for the assessment of potential NSAID-mixture effects. 
Therefore, this study (Marmon et al., 2021) paved the way for exploring the potential target-mediated effects of mixtures of PECs co-occurring in the same environmental matrix using in silico tools.
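The conversion from a measured water concentration to a fish plasma concentration mentioned above can be sketched as follows. The Fish Plasma Model is commonly implemented by multiplying the water concentration by a blood:water partition coefficient estimated from lipophilicity with the widely cited regression log P(blood:water) = 0.73 × log Kow − 0.88. Whether Marmon et al. (2021) used exactly this parametrization (for example, log Kow rather than a pH-dependent log D) is an assumption here, and the function names and input values are hypothetical.

```python
def blood_water_partition(log_kow):
    """Blood:water partition coefficient from the commonly cited regression
    log P = 0.73 * log Kow - 0.88 (assumed parametrization)."""
    return 10 ** (0.73 * log_kow - 0.88)


def fish_plasma_concentration(water_conc, log_kow):
    """Predicted steady-state fish plasma concentration.

    The result has the same concentration units as water_conc.
    """
    return water_conc * blood_water_partition(log_kow)


def effect_ratio(human_therapeutic_plasma_conc, fish_plasma_conc):
    """Ratio of the human therapeutic plasma concentration to the predicted
    fish plasma concentration (same units); lower values suggest higher concern."""
    return human_therapeutic_plasma_conc / fish_plasma_conc


# Hypothetical inputs, not measured values from any cited study.
water_ng_per_l = 100.0        # assumed concentration in surface water
assumed_log_kow = 3.0         # assumed lipophilicity of the compound
human_cmax_ng_per_l = 1.0e6   # assumed human therapeutic plasma concentration

plasma = fish_plasma_concentration(water_ng_per_l, assumed_log_kow)
ratio = effect_ratio(human_cmax_ng_per_l, plasma)
print(f"predicted fish plasma concentration: {plasma:.1f} ng/L, effect ratio: {ratio:.1f}")
```

The design point is that the comparison is made on internal (plasma) concentrations rather than on water concentrations, which is what allows pharmacological potency data to be read across from humans to fish.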

Software platforms and databases used in the included studies

Our review summarized the software platforms (Table 2) and databases (Table 3) used in the included studies. In total, 26 software platforms and 15 databases were employed in the 20 included studies that reported in silico prediction of the environmental risks posed by PECs.

Table 2 An illustrative list of software platforms recently used for the in silico prediction of environmental risks posed by PECs
Table 3 An illustrative list of databases recently used for the in silico prediction of environmental risks posed by PECs (self-developed data sets unavailable via the Internet were excluded)

As shown in Table 2, the included studies used different software platforms for different study purposes. For in silico prediction of the bioaccumulation and biodegradability of PECs, the EPI Suite™ (Della-Flora et al., 2020; Raitano et al., 2018; Regnery et al., 2022; Wielens Becker et al., 2020), VEGA (Della-Flora et al., 2020; Raitano et al., 2018; Sanabria et al., 2021; Wielens Becker et al., 2020), Prometheus (Della-Flora et al., 2020; Sanabria et al., 2021; Wielens Becker et al., 2020), and QSAR Toolbox (Della-Flora et al., 2020; Raitano et al., 2018; Sanabria et al., 2021) were the most commonly employed. The vast majority of these tools are publicly accessible and free of charge. Although the included publications address this subject, no single software platform yet integrates all components of the ERA of PECs.

The success of an in silico predictive model is highly dependent on the datasets or databases from which the related information is extracted. Apart from a few studies (Han et al., 2022; Huang et al., 2021; Kumar et al., 2021; Zhang et al., 2020) that exclusively used self-developed data sets unavailable via the Internet, the major data sources supporting prediction of the environmental risks of PECs are publicly available (Table 3). The availability of these public datasets facilitates the collection of a large number of pharmaceutical/PEC structures and the associated experimental data for in silico modeling purposes. However, we found that most of the in silico models in the included studies were built on different datasets, so their performances are not directly comparable. It might therefore be necessary to develop standardized and accepted in silico models specifically designed for key case scenarios (e.g., bioaccumulation and biodegradability) in the prediction of environmental risks associated with PECs, through the integration of pharmaceutical data and specialized tool sets for ERA.

Conclusions

As high-throughput screening tools that can evaluate a wide range of PEC candidates economically and quickly, in silico approaches are considered promising for efficiently and comprehensively mining drug-related data sets and databases for the potential ecological risks posed by pharmaceuticals occurring in the environment. This review focused on current knowledge of the application of in silico approaches to support the ERA of PECs, as reported in 20 articles published since 2018. We found that researchers’ interest and concern appeared to be sharply aroused by the outbreak of the COVID-19 crisis, but the included studies came from only 8 countries. Recently, a great diversity of in silico techniques has been employed to predict endpoints for the ERA of PECs, including bioaccumulation and biodegradability, lethality, developmental toxicity, mutagenicity, and other toxic effects such as ototoxicity and hematological toxicity. Moreover, several researchers have used in silico methods to assess the potential human health hazards of exposure to PECs based on human data. In addition, transcriptomic analysis and network pharmacology approaches have been used to simultaneously discern a broad spectrum of potential environmental risks and health effects of PECs based on the network-based drug-gene-disease relationship. However, environmental exposure concentrations of PECs and interactions among mixtures of PECs were not sufficiently addressed in the included studies. Despite the great diversity of software platforms and databases recently used by the scientific community for in silico prediction of the environmental risks posed by PECs, hardly any of them is specific to the ERA of PECs. In conclusion, in silico prediction of the environmental risks posed by PECs is still in its infancy, with limited examples, which creates both opportunities and challenges for researchers.