1 Introduction

Drug-induced liver injury (DILI) poses a significant challenge to the medical and pharmaceutical communities as well as to regulatory agencies. Many drugs have failed during clinical trials, and over 50 drugs have been withdrawn from the worldwide market because of DILI risk [1]. Because of its significant impact on public health, regulatory agencies have published a series of guidances requesting that the pharmaceutical industry better assess DILI risk during drug development, including the US Food and Drug Administration (FDA)’s guidance “Drug-Induced Liver Injury: Premarketing Clinical Evaluation” and the European Medicines Agency (EMA)’s “Non-clinical guidance on drug-induced hepatotoxicity” [2].

One significant challenge encountered by drug developers and regulators stems from the lack of sensitive screening methodologies to identify DILI signals at the early stage of drug development, especially before first-in-human testing [3]. While animal studies remain the “gold standard” among testing strategies for preventing potentially toxic drug candidates from entering clinical trials [4,5,6], they are not perfect and sometimes fail to detect hepatotoxic drug candidates; a retrospective analysis revealed that such tests failed in about 45% of DILI cases found in clinical trials [7]. In one notorious example, five subjects in a phase 2 clinical trial experienced fatal hepatotoxicity induced by fialuridine, although this investigational nucleoside analogue had shown no liver damage in animal studies [8]. There is an unmet need to predict the risk for DILI in humans more reliably and to overcome current limitations.

Many worldwide efforts have been launched to better understand and address DILI issues. In the USA, the Drug-Induced Liver Injury Network (DILIN) has been funded by the National Institutes of Health since 2003 and is still actively collecting and analyzing cases of severe liver injury caused by prescription drugs, over-the-counter drugs, and alternative medicines such as herbal products and supplements. A similar government-supported drug-induced liver injury network was recently established in Europe, funded by European Cooperation in Science and Technology (http://www.cost.eu/COST_Actions/ca/CA17112). The US FDA has a long-term effort to improve drug safety by better assessing pre-marketing and post-marketing data for signs of toxicity. At the National Center for Toxicological Research (NCTR), we have developed the Liver Toxicity Knowledge Base (LTKB), which contains diverse liver-related data such as drug properties, DILI mechanisms, and drug metabolism that can be utilized to develop new models for assessing the risk for DILI in humans [1, 5, 9,10,11,12,13,14,15,16,17,18,19,20,21]. In this chapter, we introduce our continuing efforts toward the development of computational models for the prediction of DILI risk in humans. First, we present the drug label-based approach to annotating the DILI risk associated with individual drugs. We then describe a panel of predictive models, developed from these annotations, that could be used to assess drug candidates for their potential to cause DILI before human testing or during clinical trials.

2 Annotation of DILI Risk for Marketed Drugs

Annotation of DILI risk for drugs is challenging. Drugs can carry markedly different levels of DILI risk even when their chemical structures are similar. For example, alpidem and zolpidem are both imidazopyridine-derived drugs used as anxiolytic or sleeping medication. These two drugs have similar chemical structures but distinct hepatotoxicity (Fig. 13.1): alpidem was withdrawn due to hepatotoxicity, while zolpidem is still widely used in clinical practice with only rare hepatotoxicity observed. Drugs withdrawn from the market due to hepatotoxicity and those without any observed hepatotoxicity represent the two extremes of the spectrum of risk for humans. Most drugs fall somewhere in the middle of this spectrum, depending on their associated DILI risk.

Fig. 13.1 Distinct hepatotoxicity observed between alpidem and zolpidem even though their chemical structures are similar

The DILI annotation discussed here refers to the classification of the DILI risk to which the human population is exposed through drug treatment of various diseases. An improved annotation of DILI is vital and largely determines the accuracy and utility of a predictive model [22]. At least three attributes (severity, causality, and incidence) need to be considered when assessing a drug’s potential to cause DILI [1]. However, annotating a drug’s DILI risk is not trivial in clinical practice [23] because of several hurdles: (1) the uncommon occurrence of DILI, (2) the varied and complicated clinical manifestations of DILI, (3) the lack of accurate biomarkers for DILI diagnosis, (4) the complications in causality adjudication, and (5) the severe under-reporting of DILI cases.

No single resource provides all the information required for an accurate DILI annotation [1]. The research community has made great efforts to address this challenging issue, as summarized in some reviews [10]. Overall, the approaches to annotating DILI risk are based either on case reports or on monographs. Case reports can be collected by on-going DILI research projects such as the US DILI network and the Spanish DILI registry, reported in the literature [24,25,26], or retrieved from the FDA’s adverse event reporting system [27,28,29]. Monographs are written by experts based on evidence collected from a variety of sources, such as FDA drug labeling [1], the Physicians’ Desk Reference [30], and the US Pharmacopeia. The information in monographs is authoritative but not updated as frequently as case reports [31,32,33]. Given the lack of a “gold standard” that defines DILI risk, a given drug can receive diverse annotations because of the different definitions and data sources used [34]. Comparisons among different annotations have been reported [35,36,37]. Overall, the agreement among annotations is acceptable, and concordance is normally higher among hepatotoxic drugs than among non-hepatotoxic drugs [10, 15, 38].

We selected FDA-approved drug labeling as the main supporting evidence to annotate drugs for their DILI risk in humans. Drug labeling is an authoritative document summarizing drug safety information based on the comprehensive evaluation of data from preclinical studies, clinical testing, post-marketing surveillance, and publications in the literature. The information within drug labels reflects the considered consensus of experts at the time, taking into account all three criteria (i.e., severity, causality, and incidence) mentioned above [1]. We developed a schema to gather the information from FDA-approved drug labeling to annotate DILI risk and created a benchmark dataset of 287 drugs categorized into three levels of DILI severity: most-DILI-concern, less-DILI-concern, and no-DILI-concern [33]. Specifically, the 137 drugs categorized as most-DILI-concern are those that were suspended, withdrawn, or issued a black box warning due to hepatotoxicity, or that received warnings and precautions with moderate or severe DILI concern. The 85 drugs categorized as less-DILI-concern had been issued warnings and precautions with mild DILI concern or only recorded hepatotoxicity in the Adverse Reactions section of their drug labels. The 65 drugs listed as no-DILI-concern are those with no DILI concern mentioned in their drug labels.

The safety data contained in drug labeling are not perfect. A major concern with drug labels is weakness in causality assessment [1]: a definite causal relationship is not required for drug labeling, and regulators are authorized by law to issue a warning when a clinically significant hazard is identified for a drug with reasonable evidence of causality (http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfCFR/CFRSearch.cfm?fr=201.57). Additionally, any modification or update of the safety information in drug labeling is a stringent and lengthy process, so labels are likely to lag behind the most recent clinical findings [39]. Case reports, by contrast, can be more timely and more sensitive to potential alert signals caused by drugs. Therefore, the drug labeling-based annotation of DILI risk could be further improved by incorporating case report information derived from up-to-date literature and on-going DILI research projects such as the US DILIN project.

Based on these considerations, we further refined the labeling-based annotation schema by weighing evidence from case reports together with the information from FDA-approved drug labeling to improve the accuracy of DILI annotation. More specifically, the refined annotation schema was built upon a collection of well-vetted cases (verified via thorough case evaluation by DILI experts) and adjudicated cases (verified using a standardized clinical causality assessment system, i.e., the Roussel Uclaf Causality Assessment Method [40]). With this collected causality information, the DILI risk of individual drugs was re-evaluated by complementing drug labeling with available evidence of verified causality. This new schema classified drugs into four categories, as detailed below:

  • Withdrawn drugs and those with a black box warning for severe liver injury were classified as verified most-DILI-concern (vMost-DILI-concern) drugs because they are consistently classified as high DILI risk among several published datasets.

  • For drugs whose labels warn of severe or moderate DILI (e.g., isoniazid) [1], causality must be verified under the new schema: drugs with verified causality are classified as vMost-DILI-concern; otherwise they are reassigned as “Ambiguous DILI-concern.”

  • Similarly, the less-DILI-concern drugs are reassigned as verified less-DILI-concern (vLess-DILI-concern) or “Ambiguous DILI-concern,” depending on whether evidence of causality is available.

  • A drug is confirmed as verified no-DILI-concern (vNo-DILI-concern) only if it has not been verified as a cause of DILI in the literature and no DILI is mentioned in its drug label.
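The four rules above can be read as a small decision procedure. The following Python sketch is only an illustration of that logic under stated assumptions: the class name, field names, and category strings are hypothetical and are not taken from the LTKB or DILIrank software. The text does not say how a drug with no DILI mention in its label but verified causality in the literature should be handled, so the sketch returns an explicit "unresolved" marker for that case rather than guessing.

```python
from dataclasses import dataclass


@dataclass
class DrugEvidence:
    """Hypothetical container for the evidence used by the refined schema."""
    withdrawn_or_boxed_warning: bool  # withdrawn, or black box warning for severe liver injury
    label_level: str                  # labeling-based level: "most", "less", or "none"
    causality_verified: bool          # well-vetted or adjudicated case evidence is available


def refine_annotation(ev: DrugEvidence) -> str:
    """Map label evidence plus verified causality to a refined category."""
    # Rule 1: withdrawn / boxed-warning drugs need no further verification.
    if ev.withdrawn_or_boxed_warning:
        return "vMost-DILI-concern"
    # Rule 2: label warns of severe or moderate DILI; causality decides.
    if ev.label_level == "most":
        return "vMost-DILI-concern" if ev.causality_verified else "Ambiguous DILI-concern"
    # Rule 3: less-DILI-concern labels are handled the same way.
    if ev.label_level == "less":
        return "vLess-DILI-concern" if ev.causality_verified else "Ambiguous DILI-concern"
    # Rule 4: no DILI in the label AND no verified causality in the literature.
    if ev.label_level == "none":
        if not ev.causality_verified:
            return "vNo-DILI-concern"
        return "unresolved"  # assumption: this combination is not described in the text
    raise ValueError(f"unknown label level: {ev.label_level!r}")


print(refine_annotation(DrugEvidence(False, "most", False)))
```

Keeping the label level and the causality flag as separate inputs mirrors the idea of the refined schema, in which case-report evidence complements rather than replaces the drug label.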

The refined schema was applied to 1036 marketed drugs approved by the FDA before 2010. The resulting dataset, named DILIrank, includes 192 vMost-DILI-concern drugs, 278 vLess-DILI-concern drugs, 312 vNo-DILI-concern drugs, and 254 Ambiguous DILI-concern drugs. Notably, because existing knowledge advances over time, the schema applied in DILIrank will be updated continuously as new DILI cases are reported.

3 Predictive Models Developed at NCTR

Developing a risk management plan to improve prediction of a drug’s hepatotoxic potential is a long-term effort of the research community [41], and predictive models or biomarkers are essential for assessing the risk for DILI in humans at early stages of drug development, even before the first test in humans. The development of predictive models for DILI is summarized in several seminal reviews [42]. Here, we briefly introduce some continuing efforts at the FDA’s National Center for Toxicological Research to develop models that predict the risk for DILI in humans, such as the “rule-of-two” model, the DILI score model, and conventional and modified quantitative structure–activity relationship (QSAR) models.

3.1 The “Rule-of-Two” Model [11]

Many drugs withdrawn from the market or issued a black box warning due to hepatotoxicity were prescribed at a daily dose of 100 mg or greater [43, 44], while drugs given at a lower daily dose of <10 mg were associated with less severe events, suggesting a potential relationship between hepatotoxicity risk and daily dose [31, 45]. Consequently, some experts recommended avoiding the development of drugs requiring a high daily dose to reduce potential adverse events [42, 46, 47]. However, many drugs given at high daily doses carry little or no risk of DILI, suggesting that daily dose alone is not a reliable guide for drug development, regulatory application, and clinical practice.

Besides daily dose, lipophilicity is an important physicochemical property [48] and is frequently modulated to improve bioavailability and pharmacological activity. Lipophilicity can affect hepatocyte uptake and a drug’s ADMET (i.e., absorption, distribution, metabolism, elimination, and toxicity) behavior [49], and many lines of evidence also link lipophilicity to drug toxicity. Nonetheless, it was unclear whether the combination of daily dose and lipophilicity is related to the risk for DILI in humans.

To better examine the combined effects of daily dose and lipophilicity, a set of 164 drugs labeled for their liver liabilities, derived from the LTKB benchmark dataset, was used, including N = 116 most-DILI-concern drugs and N = 48 no-DILI-concern drugs. Lipophilicity was measured by the octanol-water partition coefficient (i.e., logP), which was calculated with the atom-based AlogP method in Pipeline Pilot (version 8.0, Accelrys Inc, San Diego, CA) and categorized into three groups: <1, 1–3, and ≥3, as recommended in the literature (13). Daily doses were mainly retrieved from the WHO’s ATC database (http://www.whocc.no/atc_ddd_index) and were divided into groups of <10 mg, 10–100 mg, and ≥100 mg per day, as suggested in the literature [43, 46].

When the 164 drugs of the dataset were placed in a scatter plot of daily dose versus logP, the upper right quadrant (high daily dose and high logP) was populated mainly by most-DILI-concern drugs. Few no-DILI-concern drugs appeared in this region. The relative risk for DILI associated with various combinations of dose and logP was further assessed. Specifically, the subgroup with daily doses ≥100 mg and logP ≥ 3 was associated with a significantly higher proportion of hepatotoxic drugs than all the other subgroups combined (96% vs. 41%, odds ratio: 14.05, P < 0.001). The analysis showed a statistically significant association between logP and risk for DILI for drugs given at daily doses of ≥100 mg, whereas no statistically significant relationship between logP and hepatotoxicity was found for drugs given at daily doses of less than 100 mg.

Similar findings were observed in another independent dataset of 179 oral drugs, in which 85% of the “rule-of-two” positives were associated with hepatotoxicity compared with 59% of the “rule-of-two” negatives (odds ratio: 3.89, P < 0.01). Together, this evidence suggests that a drug given at a daily dose of ≥100 mg and with a logP ≥ 3, a combination termed the “rule-of-two,” is associated with a significantly higher risk for DILI in humans.
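As an illustration, the rule and the dose/logP groupings described above can be written in a few lines of Python. This is a minimal sketch rather than the authors' implementation; the function names are hypothetical, and the alpidem and zolpidem inputs are the daily dose and logP values quoted for these drugs in Sect. 3.2.

```python
def dose_group(daily_dose_mg: float) -> str:
    """Daily-dose category used in the analysis (<10, 10-100, >=100 mg/day)."""
    if daily_dose_mg < 10:
        return "<10"
    if daily_dose_mg < 100:
        return "10-100"
    return ">=100"


def logp_group(logp: float) -> str:
    """Lipophilicity category used in the analysis (<1, 1-3, >=3)."""
    if logp < 1:
        return "<1"
    if logp < 3:
        return "1-3"
    return ">=3"


def rule_of_two(daily_dose_mg: float, logp: float) -> bool:
    """True when a drug is a "rule-of-two" positive: dose >= 100 mg/day AND logP >= 3."""
    return daily_dose_mg >= 100 and logp >= 3


# Inputs taken from the values quoted in Sect. 3.2 of this chapter.
for name, dose, logp in [("alpidem", 150, 5.6), ("zolpidem", 10, 1.20)]:
    print(name, dose_group(dose), logp_group(logp), rule_of_two(dose, logp))
```

Both thresholds are inclusive, so a drug given at exactly 100 mg/day with a logP of exactly 3 counts as positive.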

The “rule-of-two” is a simple but effective model to predict the risk for DILI in humans and has been independently evaluated by drug safety scientists. In a study by Paul Leeson from the UK [50], the “rule-of-two” was applied to drugs that failed in development due to hepatotoxicity in humans, and 13 of 22 (59%) failed drug candidates were found to be “rule-of-two” positives (see Table 13.1). This demonstrated that the “rule-of-two” model can be applied to drug candidates with performance similar to or even better than that among marketed drugs, even though the chemical space of drug candidates in development has shifted significantly from that of marketed drugs approved decades ago. Furthermore, a study from a Pfizer team found that the “rule-of-two” model performed better than the three mechanistic endpoints they selected (i.e., cytotoxicity, mitochondrial impairment, and BSEP inhibition), whether used singly or in dual or triple combinations, when evaluated on a total of 125 drugs [51]. Moreover, the “rule-of-two” model was also applied to direct-acting antivirals for the treatment of chronic hepatitis C and successfully identified the DILI potential associated with Viekira Pak [52].

Table 13.1 “Rule-of-two” model for prediction of drugs that failed in clinical development due to hepatotoxicity in humans

3.2 DILI Score Model [12]

The “rule-of-two” model adds value for predicting DILI risk in humans but cannot predict the degree of severity [53, 54]. Additionally, besides dose and lipophilicity, other mechanistic factors could contribute to predictive models and facilitate the development of quantitative metrics [55].

Covalent binding of reactive metabolites (RM) is an important mechanistic factor in toxicity that can cause direct cellular toxicity or modulate immune reactions [56]. Numerous drugs have been reported to generate RM, although their causative relationship to human DILI is still controversial and inconclusive [57]. For instance, some reports suggest that protein adducts caused by RM are not necessarily associated with liver injury [58,59,60]. Furthermore, a large-scale retrospective analysis demonstrated that the level of covalent binding has no correlation with the incidence of liver toxicity observed in vivo in preclinical studies [57]. Even so, considering the possible toxic implications, industry still strongly recommends minimizing the potential for RM formation by drugs [61,62,63], with a target threshold of <50 pmol of RM bound per mg of protein [64].

We applied logistic regression analysis to investigate the association between daily dose, logP, RM formation, and DILI risk using N = 192 FDA-approved drugs. The multivariate regression analysis suggested that daily dose, logP, and RM formation all contributed independently to predicting DILI risk, and their contributions were ranked in the order RM > daily dose/Cmax > logP according to the regression coefficients. Consequently, we developed a DILI score model [12] derived from daily dose, logP, and RM: DILI score = 0.608 * ln(daily dose/mg) + 0.227 * logP + 2.833 * (RM formation), where RM formation is assigned 1 or 0 according to whether or not the drug can produce reactive metabolites. As an example, alpidem is given at a daily dose of 150 mg, has a logP of 5.6, and produces RM, which results in a DILI score of 0.608 * ln(150) + 0.227 * 5.6 + 2.833 * 1 = 7.15. Meanwhile, zolpidem (a drug with the same mode of action, a similar chemical structure, and a similar preclinical safety profile but with distinct liver toxicity) has a logP of 1.20 and is given at a daily dose of 10 mg, which results in a DILI score of 4.51.
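The arithmetic of the two worked examples can be checked with a short Python sketch. The coefficients are those of the formula stated above; the function name is hypothetical. The text does not state the RM status of zolpidem explicitly, so RM = 1 is an assumption here, chosen because it is the value that reproduces the reported score of 4.51 (with RM = 0 the same formula would give about 1.67).

```python
import math


def dili_score(daily_dose_mg: float, logp: float, forms_rm: bool) -> float:
    """DILI score = 0.608*ln(daily dose/mg) + 0.227*logP + 2.833*RM."""
    rm = 1 if forms_rm else 0
    return 0.608 * math.log(daily_dose_mg) + 0.227 * logp + 2.833 * rm


# Alpidem: 150 mg/day, logP 5.6, forms reactive metabolites (as stated in the text).
alpidem = dili_score(150, 5.6, True)

# Zolpidem: 10 mg/day, logP 1.20; RM = 1 is ASSUMED to match the reported 4.51.
zolpidem = dili_score(10, 1.20, True)

print(f"alpidem:  {alpidem:.2f}")   # 7.15
print(f"zolpidem: {zolpidem:.2f}")  # 4.51
```

Because the dose enters through a natural logarithm, a tenfold change in daily dose shifts the score by about 0.608 * ln(10) ≈ 1.4, which is roughly half the contribution of a positive RM flag.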

The DILI score model was evaluated on three independently published datasets to assess its capability to predict the severity of DILI risk in humans. The first dataset was derived from the LTKB-BD, with a total of N = 354 drugs annotated with DILI potential, including 124 most-DILI-concern drugs, 162 less-DILI-concern drugs, and 68 no-DILI-concern drugs. The second dataset, with N = 227 drugs retrieved from Greene et al. [24], had N = 130 drugs with evidence of human hepatotoxicity, N = 44 drugs with weak evidence, and N = 53 drugs with no evidence. The third dataset, from Suzuki et al. [26], considered the severity of human hepatotoxicity and comprised a total of 182 drugs: N = 35 withdrawn drugs, N = 61 drugs with reported acute liver failure cases, and N = 86 general DILI drugs. Overall, an increased DILI score correlated significantly with the severity of liver injury. In the first dataset, the DILI score decreased in the order most-DILI-concern > less-DILI-concern > no-DILI-concern [1], and each of the successive comparisons was statistically significant (P < 0.001). In the Greene et al. [24] dataset, drugs with evidence for overt human hepatotoxicity had significantly higher DILI scores than those with weak evidence (P < 0.001), which in turn, as expected, scored higher than those without any evidence of DILI (P < 0.001). For the data from Suzuki et al. [26], the algorithm also correctly predicted severe DILI cases (P < 0.001).

Furthermore, the DILI score model was applied to N = 165 clinical cases collected from the NIH LiverTox database (https://livertox.nih.gov/), and the DILI score was shown to correlate with the severity of clinical outcome. The DILI score model was also able to distinguish drug pairs such as minocycline/doxycycline, trovafloxacin/moxifloxacin, and benzbromarone/amiodarone, which are defined by similar molecular structure (Tanimoto similarity > 0.5) and similar mode of action but discordant toxicity [65].

3.3 Conventional QSAR [13]

QSAR models have been extensively applied to predict drug-induced liver injury because they produce rapid results without requiring physical drug substance [22, 24, 66, 67]. So far, most QSAR-DILI models have reported limited predictive performance, with accuracies of approximately 60% or less, especially when the models are challenged by external validation sets. We implemented an improved strategy to develop a QSAR model for predicting DILI in humans, using a robust annotation of DILI risk based on FDA-approved drug labeling and applying an extensive validation strategy to ensure that the model performance was sustainable and better than chance.

Our conventional QSAR model was developed using a decision forest (DF) algorithm to correlate chemical structures with DILI risk in humans, based on a set of drugs used as the training set. The DF algorithm is a supervised machine learning technique that uses a consensus approach to combine multiple heterogeneous decision trees into a more accurate predictive model. The DF algorithm was developed by our laboratory, and the software is publicly available at https://www.fda.gov/ScienceResearch/BioinformaticsTools/DecisionForest/default.htm. The chemical structures of drugs were encoded numerically (i.e., as chemical descriptors) to serve as input for the DF algorithm. Here, we utilized the Mold2 software to transform the 2-dimensional chemical structures into 777 chemical descriptors. Mold2 was also developed by NCTR and is freely available at https://www.fda.gov/ScienceResearch/BioinformaticsTools/Mold2/default.htm.

The training set used to develop the QSAR model included 197 drugs (NCTR training set), which were annotated by FDA-approved drug labeling as discussed previously. The drug label-based DILI annotation proved to be robust and consistent compared with other annotations [37], which is critical for the development of an improved QSAR model. The developed models were evaluated by internal and external validation. Internal validation employed 2000 runs of 10-fold cross-validation on the NCTR training set. External validation of the QSAR models used 3 different datasets with a total of 438 unique drugs: the NCTR validation dataset with N = 190 drugs, the Greene et al. dataset with N = 328 drugs, and the Xu et al. dataset with N = 241 drugs. The validation results in Table 13.2 show that, when using the NCTR-annotated training or validation set, the QSAR model had an accuracy of 69.7% for internal cross-validation and 68.9% for external validation. The external validation on the Greene et al. and Xu et al. datasets gave accuracies of 61.6% and 63.1%, respectively. The performance across the different datasets is largely consistent; the variations might reflect differences in annotation quality and in the drugs included in each dataset.
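The internal validation scheme (many repetitions of 10-fold cross-validation) can be sketched in plain Python as an index generator. This is an illustrative sketch only: the chapter does not specify how folds were drawn, so simple random shuffling without stratification is assumed, and the function name and seed handling are hypothetical. Model training and scoring are left to the caller.

```python
import random


def repeated_kfold(n_samples: int, k: int = 10, n_repeats: int = 2000, seed: int = 0):
    """Yield (train_indices, test_indices) for n_repeats runs of k-fold CV."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        rng.shuffle(indices)  # new random partition for every run
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [indices[i::k] for i in range(k)]
        for test in folds:
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            yield train, test


# Example with the size of the NCTR training set (197 drugs) and 2 runs.
for train_idx, test_idx in repeated_kfold(197, k=10, n_repeats=2, seed=1):
    pass  # fit a model on train_idx and score it on test_idx here

print(len(train_idx), len(test_idx))
```

With 197 drugs and 10 folds, seven folds hold 20 drugs and three hold 19, so every drug is predicted exactly once per run.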

Table 13.2 Conventional QSAR performance evaluated by cross-validation and independent validation

Besides the QSAR model for predicting two classes of DILI risk, we also developed a model to assess three classes of DILI risk (i.e., most-DILI, less-DILI, and no-DILI) [68]. The model was developed using decision forest (DF) and Mold2 structural descriptors together with the DILIrank dataset, in which >1000 drugs were evaluated for their likelihood of causing DILI in humans and >700 drugs were classified into the three categories used for model development. As with the two-class QSAR model, the three-class models were evaluated via cross-validation, bootstrapping validation, and permutation tests to assess potential chance correlation. Moreover, prediction confidence analysis was conducted to provide an additional interpretation of prediction results. These results indicated that the 3-class model differentiated most-DILI drugs from no-DILI drugs with higher accuracy than the 2-class DILI model and has the potential to categorize DILI risk at a higher resolution.

3.4 Modified QSAR Models

Besides developing conventional QSAR models based on chemical structure information only, we also tried to incorporate other drug information, especially information related to DILI-relevant biological functions, to improve model performance. For instance, understanding the mode of action (MOA) of a drug is critical in safety assessment, so considering the MOA of drugs is a promising way to improve predictive models for DILI. To this end, we developed an algorithm named MOA-DILI [69], which integrates MOA and structural information to enhance DILI prediction.

Unlike a conventional QSAR model, the modified model uses MOA information to categorize drugs, i.e., drugs are categorized into an active or inactive group for each specific MOA. The underlying hypothesis is that drugs sharing an MOA also share similar DILI mechanisms and can thus be predicted by the same QSAR model. In other words, one model is developed to distinguish DILI drugs among all MOA-active drugs and another to distinguish DILI drugs among all MOA-inactive drugs. Finally, these two QSAR models, for active and inactive drugs, respectively, are merged into one assay-specific QSAR model (Fig. 13.2a).
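The partition-then-merge idea for a single assay can be sketched as follows. This is a schematic illustration, not the MOA-DILI implementation: the function names are hypothetical, and the learner passed in as `train` stands in for the decision forest used in the chapter. A trivial majority-class learner is included only so that the sketch runs on its own.

```python
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], int]            # descriptors -> 0 (no DILI) or 1 (DILI)
Trainer = Callable[[List[Sequence[float]], List[int]], Model]


def fit_assay_specific_model(descriptors, labels, assay_active, train: Trainer):
    """Fit one model on assay-active drugs and one on assay-inactive drugs,
    then merge them into a single predictor that routes by assay outcome."""
    groups = {True: ([], []), False: ([], [])}
    for x, y, active in zip(descriptors, labels, assay_active):
        groups[bool(active)][0].append(x)
        groups[bool(active)][1].append(y)
    active_model = train(*groups[True])
    inactive_model = train(*groups[False])

    def predict(x: Sequence[float], is_active: bool) -> int:
        return active_model(x) if is_active else inactive_model(x)

    return predict


def majority_trainer(X, y) -> Model:
    """Placeholder learner (assumption): always predicts the majority class."""
    majority = int(len(y) > 0 and 2 * sum(y) >= len(y))
    return lambda x: majority


X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]          # toy descriptors
y = [1, 1, 0, 0, 0, 1]                                  # toy DILI labels
active = [True, True, True, False, False, False]        # toy assay outcomes
predict = fit_assay_specific_model(X, y, active, majority_trainer)
print(predict([0.25], True), predict([0.25], False))
```

Passing the learner in as a parameter keeps the routing logic independent of the underlying algorithm, so the same wrapper could be repeated once per assay before a consensus step.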

Fig. 13.2 a Workflow for MOA-DILI modeling and b modeling performance of the MOA-DILI model

A total of 17 toxicity-relevant MOA assays were curated from the Tox21 dataset [70], including estrogen receptor (ER), androgen receptor (AR), mitochondrial toxicity, p53, PPAR gamma, etc. Accordingly, 17 specific MOA-QSAR models were developed, and a consensus approach was applied to determine the DILI risk associated with drugs. Feature selection strategies (e.g., sequential forward selection) were used to determine the DILI-relevant MOAs (assays) for the final model.

The proposed MOA-DILI model was tested on 333 drugs with both clinical DILI annotation and Tox21 assay data available. Mold2 software [71] was used to generate chemical descriptors for the development of QSAR models. Hold-out and cross-validation were used to evaluate model performance. For the hold-out approach, the 333 drugs were randomly split into 2/3 (222 drugs) and 1/3 (111 drugs). The former were used to develop a model, while the latter were used to evaluate it. The hold-out process was repeated 1000 times to generate training/test set pairs. Cross-validation was applied within the training set to evaluate model performance. Label permutation testing, with the DILI severity annotations randomly shuffled, was applied to check whether the model could generate results better than random.
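The repeated hold-out split and the label permutation control can be sketched in plain Python. The sketch below only generates index splits and shuffled labels; the function names and seeds are hypothetical, and it assumes simple random sampling, which the chapter does not describe in further detail.

```python
import random


def holdout_splits(n_samples: int, train_fraction: float = 2 / 3,
                   n_repeats: int = 1000, seed: int = 0):
    """Yield (train_indices, test_indices) for repeated random hold-out."""
    rng = random.Random(seed)
    n_train = round(n_samples * train_fraction)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        yield order[:n_train], order[n_train:]


def permuted_labels(labels, seed: int = 0):
    """Return a shuffled copy of the labels for a permutation test.
    Class counts are preserved, but the link to each drug is broken."""
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)
    return shuffled


train_idx, test_idx = next(holdout_splits(333))
print(len(train_idx), len(test_idx))
```

A model trained on permuted labels should perform near chance; a clear gap between that baseline and the real-label accuracy is what supports the claim that the model has learned more than noise.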

The optimized MOA-DILI model employed four assays, i.e., ARE-bla (antioxidant response element), ER-luc-bg1-4e2-antagonist (ERalpha, BG1 cell line), gh3-tre-antagonist (thyroid receptor), and PPARG-bla-agonist (peroxisome proliferator-activated receptor gamma). The optimized MOA-DILI model achieved a prediction accuracy of 0.757 in 5-fold cross-validation and 0.695 in hold-out testing, which is significantly higher than the results obtained from the permutation test (Fig. 13.2b). Moreover, the optimized model has a significantly higher predictive performance than the conventional QSAR model alone (Table 13.3), demonstrating that integrating MOA data of drugs improves predictive power for hepatotoxicity.

Table 13.3 Overall performance of the MOA-DILI model in training and test sets

Another modified QSAR model, the DILI prediction systems model [72], was also developed; it aims to translate post-marketing surveillance information back to the preclinical stage to improve DILI prediction performance. The DILI prediction systems model hypothesizes that there exists a set of hepato-related side effects with the discriminative power to distinguish between drugs with and without the risk for DILI. In silico models can then be developed for those hepato-related side effects based on a drug’s chemical structure using machine learning algorithms. Based on SIDER datasets [73], 13 different hepato-related side effects were identified, and corresponding models were developed using a naïve Bayesian classifier and combined in a single cohesive prediction system. The DILI prediction systems model yielded 60–70% accuracies when evaluated using drugs from different DILI annotations. Furthermore, when a drug was predicted as positive by at least three side-effect models, the positive predictive value rose to 91%.
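The "at least three positive side effects" criterion can be expressed as a simple vote count over the per-side-effect predictions. The sketch below is illustrative only: the function name and the placeholder side-effect keys are hypothetical (the chapter does not list the 13 side effects), and the threshold of three corresponds to the high-confidence setting mentioned above, not necessarily to the system's default decision rule.

```python
from typing import Mapping


def high_confidence_positive(predictions: Mapping[str, bool], min_positive: int = 3) -> bool:
    """True when at least `min_positive` side-effect models predict the drug as positive."""
    n_positive = sum(1 for hit in predictions.values() if hit)
    return n_positive >= min_positive


# Placeholder keys standing in for the 13 hepato-related side-effect models.
preds = {f"side_effect_{i}": (i <= 3) for i in range(1, 14)}
print(high_confidence_positive(preds))
```

Raising `min_positive` trades sensitivity for positive predictive value, which is consistent with the higher PPV reported for drugs flagged by three or more side-effect models.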

4 Conclusion

Reliably predicting the risk for DILI in humans is still an unmet need in the research community [34]. Accurate annotation of DILI risk is vital for the development of robust models for predicting DILI risk in humans; however, appropriate annotation is not a trivial task. We utilized FDA-approved drug labels to annotate a given drug’s risk for DILI in humans, an approach that was demonstrated to be robust and consistent across different types of drugs. The schema was further improved by weighing evidence from case reports and was applied to 1036 FDA-approved drugs to classify them into three verified DILI groups (i.e., vMost-, vLess-, and vNo-DILI-concern), with an additional group of drugs with DILI concern but without verified causality (ambiguous annotation).

Beyond improved DILI annotations, better models can be developed by utilizing the relevant contributing factors and advanced modeling technologies. We have developed a series of computational predictive models that use in silico or physicochemical methods, including the “rule-of-two” model, the DILI score model, conventional QSAR models for the prediction of two classes and multiple classes of DILI, and modified QSAR models including the MOA-DILI model and the DILI prediction systems model. Some models, such as the “rule-of-two,” were independently validated and successfully identified drugs with significant hepatotoxicity. In the future, emerging technologies (e.g., high-throughput screening or high-content assays, induced pluripotent stem cells (iPSCs), engineered human liver cocultures, and 3D cell culture) [74,75,76,77,78] could be incorporated into predictive models for better identification of DILI liability at the early stage of drug development. In addition to the drug properties discussed above, host factors and their interactions with drug properties [79, 80] should be considered, and this information should be incorporated into current drug-based models to improve the prediction of DILI.