Introduction

Carcinogenicity is one of the most important hazard properties in the safety evaluation of chemicals, and over the last few decades quantitative structure–activity relationships (QSARs), together with in vitro and in silico tests, have become important for regulatory use. Several classification models are currently available to predict carcinogenicity in rodents, and several QSAR models have been proposed for this purpose, but few models quantitatively assess carcinogenicity in humans. The cancer slope factor (CSF), a parameter describing carcinogenic potency that is used in human risk assessment, has never been modeled for both oral and inhalation exposure. Characterizing the effects of chemicals is considered a priority research area by environmental and health-related institutions in many countries, and evaluating chemical carcinogenicity based on the CSF is a key part of health risk assessment (Toma et al. 2020).

Exposure to a chemical or mixture occurs in the environment, at home, and in the workplace, and diet, drugs, and lifestyle can also be important co-triggers (Li and Suh 2019). Adverse effects include chronic disease and cancer, which is now a major public health problem with a high incidence. Although the procedure is complex, costly, and time-consuming, animal models remain the most widely used investigation method and are in great demand (Madia et al. 2016). Recently, various non-animal approaches, including in silico methods such as QSAR models and expert systems, have been proposed as alternative or complementary ways to evaluate carcinogenicity while reducing animal experiments, evaluation time, and cost (Golbamaki et al. 2016; Yamane et al. 2016). Most in silico carcinogenicity models predict whether a chemical is carcinogenic in an animal model (Zhang et al. 2017). Many of these models have already been implemented as license-based or freely available software tools, but models for the oral and inhalation slope factors (SF) used in the human risk assessment of environmental contaminants have not yet been developed (Raitano et al. 2018; Bossa et al. 2018). The SF is an upper-bound estimate of the slope of the dose–response curve in the low-dose region for carcinogens, and is used to assess the lifetime increase in cancer incidence. The CSF is used to estimate the cancer risk associated with exposure to carcinogens or potential carcinogens; it is an upper bound, approximating a 95% confidence limit, on the increased cancer risk from lifetime exposure to a chemical by ingestion or inhalation (Basic Information about the Integrated Risk Information System 2023). Therefore, the higher the slope value, the higher the carcinogenic potential.
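The role of the slope factor in risk estimation can be illustrated with a minimal sketch. It assumes the usual low-dose linear approximation, in which the excess lifetime cancer risk is the product of the CSF and the lifetime average daily dose. The function name and the numerical values are hypothetical and are not taken from this study.

```python
def excess_cancer_risk(csf: float, daily_dose: float) -> float:
    """Low-dose linear estimate of excess lifetime cancer risk.

    csf        : cancer slope factor in 1/(mg/kg-d)
    daily_dose : lifetime average daily dose in mg/kg-d
    """
    if csf < 0 or daily_dose < 0:
        raise ValueError("CSF and dose must be non-negative")
    return csf * daily_dose


# Illustrative values only (hypothetical, not from the study data).
risk_high_csf = excess_cancer_risk(csf=1.5, daily_dose=1e-4)
risk_low_csf = excess_cancer_risk(csf=0.05, daily_dose=1e-4)
print(f"high-CSF chemical: {risk_high_csf:.1e}")
print(f"low-CSF chemical:  {risk_low_csf:.1e}")
```

At the same dose, the chemical with the larger slope factor yields the larger estimated risk, which is why CSF values can serve as a basis for ranking chemicals.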

If the chemical is a known or probable human carcinogen, a toxicity value (i.e., a slope factor) is calculated that quantitatively defines the relationship between dose and response. Since risk at low exposure levels is difficult to measure by animal experiments or epidemiological studies, deriving a slope factor usually requires fitting a model to the available data set and using it to extrapolate from the relatively high doses administered in the experiment to lower exposure levels (Risk Assessment for Carcinogenic Effects 2022).

This study differs from previous work in the following respect. In most environmental risk management studies, the CSF is used to calculate the excess carcinogenic risk and thereby determine the level of risk to the human body; here, the CSFs of chemicals are compared in order to make the selection of chemicals for carcinogenic inhalation toxicity tests more efficient. Research aimed at establishing a chemical selection system for new inhalation carcinogenicity tests has not previously been attempted.

Substances for carcinogenic inhalation toxicity testing need to be selected from a new perspective, by comparing the CSFs (the carcinogenic coefficient, i.e., the carcinogenic potential) used in the hazard and risk assessment of chemicals, so that carcinogenicity testing can proceed efficiently. This required constructing a database covering the various aspects needed to select target chemicals for the inhalation toxicity test. In this study, I tried to estimate which chemicals are likely to be carcinogenic and, for those that are, their carcinogenic potency for humans and laboratory animals using oral and inhalation slope factors. Making a model version of these findings available free of charge would greatly aid health risk assessment by making it easier to screen for carcinogenic chemicals. By comparing the CSFs of chemicals that have become social issues or have been reported in various papers, I tried to build an efficient working system for selecting chemicals for inhalation carcinogenicity tests.

Materials and methods

The CSFs of chemicals were compared to make the selection of target chemicals for carcinogenic inhalation toxicity tests more efficient, and to help establish a new target selection system for inhalation carcinogenicity tests. The list centers on chemicals that have become social issues or that have been reported in various papers; by ranking these chemicals according to their CSF values, I tried to contribute to the list of priority chemicals for carcinogenicity testing.

Target chemicals were selected from the chemicals designated as existing chemicals by the Ministry of Employment and Labor in Korea, using literature searches in Google Scholar, PubMed, ScienceDirect, and similar sources. The CSF of each chemical was determined using various sites and programs, including the EPA CompTox Dashboard and VEGA Hub QSAR (ver. 1.2.3). Values were searched and analyzed separately for oral and inhalation exposure. VEGA stands for “Virtual models for property Evaluation of chemicals within a Global Architecture”, and VEGA Hub QSAR is a downloadable package developed and distributed by the Laboratory of Environmental Chemistry and Toxicology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Italy.

Gene expression related to each chemical was analyzed using the Comparative Toxicogenomics Database (CTD). Together with the obtained CSF values, this analysis was used to estimate the priority of each chemical for inhalation carcinogenicity testing, and a database (chemical list) was compiled in order of CSF value.

Results

The final study population comprised 960 chemicals, selected from chemicals that had become social issues or had been reported in various papers. Chemicals with a circulation volume of 1,000 tons or more were prioritized as the primary DB, and the SMILES representation of each chemical was entered for the subsequent searches.

Gene expression related to each chemical, as retrieved from the CTD site (ctdbase.org), was summarized, and the CSF value for each chemical was looked up on the EPA Computational Toxicology site.

Using the VEGA Hub program (ver. 1.2.3), the CSF value of each chemical was predicted in silico, and the predicted values were classified into oral and inhalation. In addition, the carcinogenicity of each chemical was predicted using the in silico carcinogenicity classification models in VEGA Hub and ProTox-II (tox-new.charite.de/protox_II).

Using KOSHA-MSDS, the GHS classification of each chemical and the carcinogen classifications assigned by IARC, NTP, EPA, OSHA, ACGIH, NIOSH, etc., were consulted. Additionally, reference values for the CSF of each chemical were classified and organized using the OncoLogic 9.0 program.

All of the above results were summarized in an Excel file (provided as an appendix). The priority of chemicals for inhalation carcinogenicity testing was then estimated by comparing gene expression with CSF values, with particular attention to chemicals with large inhalation-related values.

Table 1 lists the chemicals that were found to express cancer-related genes and that have a CSF value of 1 or more, where the CSF is the VEGA in silico inhalation value in units of 1/(mg/kg-d) (d = day); 17 chemicals are listed in total. Table 2 lists the chemicals with a CSF value of 1 or more whose gene expression involves the carcinogenesis-related signaling pathway (Fig. 1); 44 chemicals are listed in total.
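The selection rule behind these tables can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the record fields, the chemical names, and the values are hypothetical placeholders, and the study itself compiled its lists in an Excel file rather than with this code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chemical:
    name: str                         # hypothetical identifier
    inhalation_csf: Optional[float]   # predicted CSF, 1/(mg/kg-d); None if unavailable
    cancer_gene_hit: bool             # cancer-related gene expression reported


def priority_candidates(chemicals: List[Chemical], threshold: float = 1.0) -> List[Chemical]:
    """Keep chemicals with inhalation CSF >= threshold and a cancer-related
    gene expression record, ordered from highest to lowest CSF."""
    hits = [
        c for c in chemicals
        if c.inhalation_csf is not None
        and c.inhalation_csf >= threshold
        and c.cancer_gene_hit
    ]
    return sorted(hits, key=lambda c: c.inhalation_csf, reverse=True)


# Placeholder records (not study data).
sample = [
    Chemical("chem_A", 2.3, True),
    Chemical("chem_B", 0.4, True),    # below the CSF threshold
    Chemical("chem_C", 7.1, True),
    Chemical("chem_D", 5.0, False),   # no cancer-related gene expression
    Chemical("chem_E", None, True),   # no CSF prediction available
]

for chem in priority_candidates(sample):
    print(chem.name, chem.inhalation_csf)
```

Sorting by descending CSF reflects the emphasis on chemicals with large inhalation-related values.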

Table 1 Priority substances for inhalation carcinogenicity
Table 2 Priority substances for inhalation carcinogenicity
Fig. 1 Oncogenesis-related signaling pathway (hsa05200). Sourced from the Kyoto Encyclopedia of Genes and Genomes (KEGG), https://genome.jp/pathway/hsa05200. Adapted with permission

Table 3 lists the chemicals expected to be carcinogens after excluding the false positives in Table 2; 11 chemicals remain in total.

Table 3 Substances expected to be carcinogens (11 chemicals), excluding the false positives in Table 2

Discussion

In this study, an integrated in silico approach was attempted for the evaluation of chemical carcinogenic potential, including classification models and models for inhalation and oral human carcinogenicity based on CSFs. The CSF, a parameter describing carcinogenic potency that is used in human risk assessment, has never previously been modeled for both inhalation and oral exposure. The cancer potency factor (CPF), or CSF, is a parameter used in the quantitative risk assessment of a chemical or drug that is evaluated as a carcinogen. Cancer potency is measured as the slope of the straight line obtained by linear extrapolation in the low-dose region of a chemical dose–response curve (Farris and Ray 2014).

In silico models are evolving toward integrating multiple perspectives, and this integration will allow better use of the available data and information to tackle more difficult tasks. Users may be interested in the application of these tools, the evaluation of specific chemicals, or the evaluation of a large group of chemicals, and VEGA’s development approach addresses these user needs by reducing the barriers between different approaches (Benfenati et al. 2019).

The oral slope factor (OSF) is used to quantitatively estimate the carcinogenic potency or risk associated with chemical exposure through the oral route (Kar et al. 2012). The overall risk associated with chemical exposure is determined by combining quantitative estimates of chemical exposure with the known effects. For carcinogenic chemicals, the OSF and the inhalation unit risk are used to estimate the carcinogenic risk associated with exposure by the oral and inhalation routes, respectively (Rim 2020).

In this study, the final population of 960 chemicals was selected from substances that had become social issues or had been reported in various papers, and the gene expression related to each chemical, as retrieved from the CTD site, was summarized. The EPA Computational Toxicology site was searched for the CSF value of each chemical, and the results were organized. However, values were available for only a few substances, so the VEGA Hub program was used for in silico analysis. The CSF value of each chemical was predicted, and the predicted values were divided into oral and inhalation and summarized. In addition, the carcinogenicity predictions for each chemical obtained with the in silico carcinogenicity classification models in VEGA Hub and ProTox-II were summarized.

This study considers gene expression together with the CSF, which is used in the hazard and risk assessment of chemicals to obtain the excess carcinogenic risk by multiplying the lifetime exposure by the carcinogenic potential. It was intended as a new and effective way to select target substances for toxicity tests. In VEGA Hub QSAR, a prediction that is positive when the actual result is negative is termed a “false positive”, and a prediction that is negative when the actual result is positive is termed a “false negative”. In this study, carcinogens were predicted from the CSF values, and it was judged that false positives could be distinguished according to whether the chemical was a carcinogen by its experimental value. Sensitivity and specificity describe the accuracy of a test in reporting the presence or absence of a condition. The terms ‘sensitivity’ and ‘specificity’ were introduced in 1947 by Jacob Yerushalmy, a biostatistician (Yerushalmy 1947). Sensitivity (true positive rate) is the probability of a positive result when the condition is truly present, while specificity (true negative rate) is the probability of a negative result when the condition is truly absent.
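The two measures can be written out as a short sketch. The confusion-matrix counts below are hypothetical and chosen only for illustration; they are not the counts underlying the values reported in this study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no actual positives: sensitivity is undefined")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no actual negatives: specificity is undefined")
    return true_neg / total


# Hypothetical counts for a carcinogenicity classifier (illustration only).
tp, fn, tn, fp = 8, 2, 15, 5
print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"specificity = {specificity(tn, fp):.2%}")
```

Because the two rates have different denominators (actual positives versus actual negatives), adding a further selection criterion can raise one while lowering the other.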

Table 4 shows the changes in sensitivity and specificity in predicting carcinogenicity through VEGA Hub QSAR. When only carcinogenicity was predicted through the QSAR, the sensitivity was 53.85%; when the CSF was additionally considered, it increased to 58.82%; and when carcinogenesis-related gene expression was additionally considered, it increased to 72.73%. Likewise, when only carcinogenicity was predicted through the QSAR, the specificity was 44.32%; when the CSF was additionally considered, it increased to 86.15%; and when carcinogenesis-related gene expression was additionally considered, it decreased slightly to 80.56%.

Table 4 Changes in VEGA Hub QSAR carcinogenicity predicted sensitivity and specificity

This indicates that when a substance to be tested for carcinogenicity is selected by considering its carcinogenic potential together with QSAR, true negatives as well as true positives can be distinguished, which is a significant improvement. However, because the expression of genes related to carcinogenesis cannot be found for all chemicals, additional consideration and research on methods for improving sensitivity and specificity using QSAR, etc., are judged to be necessary.

As for the expected effect and use of this study, it contributes to the efficient selection of priority chemicals for inhalation carcinogenicity testing. A new attempt was made to estimate CSF values using computational toxicology and toxicogenomics for chronic/carcinogenic inhalation toxicity, and these CSF values can be used as a new framework for selecting test chemicals for inhalation tests.

It was considered necessary to establish a DB covering various aspects in order to select chemicals for carcinogenicity testing from a new perspective, by comparing the CSF (as a carcinogenic potential) used in the hazard and risk assessment of chemicals. By comparing the CSF values of chemicals that have become social issues or have been reported in various papers, I sought to contribute to the list of chemicals subject to carcinogenicity testing. Based on the obtained CSF values and the analysis of toxicity-related gene expression for each chemical in the CTD, the inhalation carcinogenicity priority was estimated and a DB (a chemical list) was compiled according to CSF value. All the contents were organized and presented in an Excel file, and the priority for inhalation carcinogenicity was estimated through comparison with gene expression, focusing on CSFs, especially those with large inhalation-related values.

In this study, the sensitivity of carcinogenicity prediction through VEGA Hub QSAR was 53.85% when only carcinogenicity was predicted through the QSAR; when the CSF was additionally considered, it increased to 58.82%; and when the expression of carcinogenesis-related genes was additionally considered, it further increased to 72.73%. Likewise, when only carcinogenicity was predicted through the QSAR, the specificity was 44.32%; when the CSF was additionally considered, it increased to 86.15%; and when carcinogenesis-related gene expression was additionally considered, it decreased slightly to 80.56%. This indicates that when a substance to be tested for carcinogenicity is selected by considering its carcinogenic potential together with QSAR, true negatives as well as true positives can be distinguished in predicting carcinogenicity. When the expression of carcinogenesis-related genes was also considered, the identification of true positives increased further, but the identification of true negatives did not show much improvement. However, the expression of carcinogenesis-related genes cannot be found for all chemicals, so additional consideration and research on this point are judged to be necessary.