11.1 Introduction

Drug discovery normally starts with the discovery or diagnosis of a novel disease or pathogen that impairs quality of life. Researchers then look for a chemical (a simple molecule or a complex protein) with a therapeutic effect that can benefit patients’ health, and develop a new drug based on that substance. A drug candidate suitable for development on an industrial scale must also have few severe or long-term side effects (Xia 2017), a low likelihood of drug resistance, affordability for patients and profitability for pharmaceutical companies (David et al. 2009; Drews and Ryser 1997), along with minimal damage to the environment (Boxall et al. 2012).

The drug discovery pipeline includes target identification and study, hit discovery, hit-to-lead generation, lead optimization, candidate identification, and preclinical and clinical trials (Zhong et al. 2018). Recent estimates suggest that bringing a novel prescription drug to market requires a mean pre-tax expenditure of approximately 3 billion USD (DiMasi et al. 2016) and takes roughly 13 years (Paul et al. 2010). Nevertheless, only 13% of potential medicinal chemicals are estimated to be approved after the clinical phases, which reflects a high risk of failure (Zhong et al. 2018). Possible reasons for this low approval rate include unexpected toxicity, inability to compete in the market and, most importantly, lack of clinical efficacy (Kola and Landis 2004). Computer-based drug design methods are among the most promising approaches to tackling this challenge (Baig et al. 2016).

Bioinformatics is an interdisciplinary science that includes proteomics, genomics, transcriptomics and molecular phylogenetics (Xia 2017). It facilitates drug discovery by using high-throughput molecular data to examine differences between symptom carriers (cell lines, animal models or patients) and control groups (Xia 2017). Such comparisons aim to (1) find associations between diseases and genetic, epigenetic and environmental factors affecting gene expression, (2) screen drug targets that can eliminate cellular malfunction or improve cellular function, (3) predict or modulate drug candidates to obtain the desired outcome and minimize toxicity, and (4) assess the impact on the environment and the likelihood of drug resistance (Xia 2017).

The role of symptom-based bioinformatics in drug development depends on the type of disease: infectious disease, genetic disease or cancer (van Driel and Brunner 2006). For genetic disorders, bioinformatics supports drug discovery mainly by identifying noninvasive tools for genetic diagnosis and prognosis (Wooller et al. 2017). For infectious diseases, it examines the impact of bacterial or viral presence on gene expression and compares it with that of other pathogens or with drug-induced expression changes to explore new uses for existing drugs (Wooller et al. 2017). Bioinformaticians can also identify the main genetic causes of cancer in individual patients and hence personalize cancer treatment, facilitate the discovery of novel drugs or repurpose existing ones (Wooller et al. 2017; Zhang et al. 2009). In drug screening, bioinformatics supports high-throughput screening of libraries related to the drug target as well as secondary assays (Fox et al. 2006; Nemmani 2021). It also contributes to the early elimination of candidates with undesirable properties (Smith 2002a). In the next step of the drug discovery pipeline, bioinformatics software and platforms are applied to drug validation. Thanks to bioinformatics tools and techniques, pharmaceutical companies have gained a better understanding of how the human genome affects the efficacy of drug candidates (Chang 2005).

One of the earliest and best-known contributions of bioinformatics to the pharmaceutical industry is the discovery, using simple string matching, of sequence homology between platelet-derived growth factor (PDGF) and v-sis, an oncogene from a sarcoma virus (Doolittle et al. 1983; Waterfield et al. 1983). This finding opened two novel lines of thinking in cancer biology. First, growth factors such as PDGF could be targets for anti-cancer drugs (Pietras et al. 2003). Second, cancer can result from the disruption of any factor that regulates gene expression. This bioinformatics-driven finding thus led to a new conceptual framework that advanced anti-cancer drug development (Moffat et al. 2014). It is just one outstanding example of how bioinformatics can facilitate the development of a novel drug. This chapter therefore focuses on how bioinformatics supports the discovery of potential drugs, covering its role in drug development, drug screening and drug validation, together with some notable achievements.
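
To make the idea of simple string matching concrete, the following minimal sketch lists the short substrings (k-mers) shared by two sequences. It is an illustration only, not the procedure used in the cited studies: the function name `shared_kmers`, the window length `k` and the two toy peptide strings are all hypothetical, and the strings are not real PDGF or v-sis sequences.

```python
def shared_kmers(seq_a: str, seq_b: str, k: int = 4) -> dict[str, list[tuple[int, int]]]:
    """Return each length-k substring present in both sequences,
    mapped to its (position in seq_a, position in seq_b) pairs."""
    # Index every k-mer of seq_a by its start positions.
    index: dict[str, list[int]] = {}
    for i in range(len(seq_a) - k + 1):
        index.setdefault(seq_a[i:i + k], []).append(i)

    # Look up each k-mer of seq_b in that index.
    hits: dict[str, list[tuple[int, int]]] = {}
    for j in range(len(seq_b) - k + 1):
        kmer = seq_b[j:j + k]
        for i in index.get(kmer, []):
            hits.setdefault(kmer, []).append((i, j))
    return hits


# Hypothetical toy peptide fragments (NOT real PDGF or v-sis sequences).
query = "MKTAYIAKQR"
subject = "GGTAYIAKLL"

matches = shared_kmers(query, subject, k=4)
for kmer, positions in matches.items():
    print(kmer, positions)
```

Runs of overlapping shared k-mers at matching offsets, as in this toy pair, are the kind of signal that prompts a closer alignment; modern homology searches use scoring matrices and statistics rather than exact matching alone.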

11.2 Bioinformatics in Drug Development

Drug development covers the search for suitable new drug molecules from the early research stages through phase III clinical trials, the process of bringing the drug to market, and post-marketing testing (Chen et al. 2021). Finding a suitable drug traditionally takes a long time and costs a great deal of money (Preziosi 2007). Today, with advances in science and technology, especially in bioinformatics, the stages of drug development can be significantly shortened while costs fall and treatment effectiveness increases (Chen et al. 2021; Moore and Allen 2019). High-throughput screening is currently a popular method for detecting potential small molecules among the large amount of information in available data libraries (McLean 2015). The molecules are then tested for their ability to bind to their target or to work in vivo and, if appropriate, can be used as a starting point for drug testing in animals. In addition, bioinformatics helps scientists link disease symptoms to genetic mutations, identify drugs capable of restoring or eliminating damaged cells, predict the effectiveness and side effects of drugs, and assess drug resistance (Xia 2017).

With the vast amount of data available from gene libraries and from reports on mutations, epigenetics, proteomics and biological processes, bioinformatics has greatly contributed to the discovery of potential drugs (Fig. 11.1). Through genome analysis, biologists and pharmacologists can find drugs capable of treating genetic diseases or targeting pathogens. Bioinformatics tools are often used to screen gene sequences, thereby uncovering potential binding sites for therapeutic drugs (Xia 2012). For example, bioinformatics research has shown that potential LXR response elements (LXREs) regulate the human ADFP gene, which has important implications for the treatment of fatty liver (Kotokorpi et al. 2010). Studies of pathogen genomes, such as those of bacteria, have revealed genetic sites specific to disease-causing species; these are important targets for treating infections while limiting the development of drug resistance (Gal-Mor and Finlay 2006). Metabolic pathways in pathogenic microorganisms are also explored by bioinformatics, and drugs that target these pathways may be developed in the future (Bhatia et al. 2014). Bioinformatics can also reduce the cost of drug discovery by repurposing existing drugs to treat new pathogens (Ding et al. 2014). Genome-level knowledge of the surface structures of many pathogenic microorganisms, such as galactofuranose (an important component of pathogenic bacteria that is not found in humans), has led many studies to propose drug development targets directed at such components (Pedersen and Turco 2003). Bioinformatics also provides a rich source of data on epigenetic changes, genes, metabolic processes and related substances, thereby helping to develop more effective drugs (Kanehisa 2013).

Bioinformatics provides not only genetic and mutation data but also a large amount of information about transcription, which supports drug development through phenotypic screening to identify potential drug candidates and determine drug targets (Xia 2017). Gene expression and metabolic patterns obtained from bioinformatics databases play an important role in discovering drugs such as anti-cancer agents or treatments for metabolic diseases (Wishart 2016; Xia et al. 2009). Using such databases, researchers have computationally identified natural compounds with great potential for drug development against COVID-19 (Muhseen et al. 2021). A series of reports on the use of quantitative structure–activity relationship (QSAR) modeling, machine learning and deep learning have yielded promising results for the development of anti-aging drugs and treatments for infections (Yeh et al. 2021; Araujo et al. 2020).
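
Screening a gene sequence for candidate regulatory binding sites can be sketched as a sliding-window motif scan. The example below is a minimal illustration under stated assumptions, not the method of the cited studies: it assumes a response element modeled as two copies of the half-site AGGTCA separated by four arbitrary nucleotides (a simplification; real elements are degenerate and are usually scored with position weight matrices), and the promoter string and function names are hypothetical.

```python
HALF_SITE = "AGGTCA"  # assumed consensus half-site (simplified model)
SPACER = 4            # assumed number of arbitrary bases between the two half-sites


def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_direct_repeats(seq: str, half_site: str = HALF_SITE,
                        spacer: int = SPACER, max_mismatches: int = 1):
    """Slide a window over seq and report (start, site, mismatches) for every
    window whose two half-sites differ from the consensus at no more than
    max_mismatches positions in total."""
    seq = seq.upper()
    h = len(half_site)
    width = 2 * h + spacer
    hits = []
    for start in range(len(seq) - width + 1):
        first = seq[start:start + h]
        second = seq[start + h + spacer:start + width]
        total = count_mismatches(first, half_site) + count_mismatches(second, half_site)
        if total <= max_mismatches:
            hits.append((start, seq[start:start + width], total))
    return hits


# Hypothetical promoter fragment, invented for illustration only.
promoter = "TTGACAGGTCACTGAAGGTCATTCC"
hits = scan_direct_repeats(promoter)
print(hits)
```

Hits from a scan like this are only candidates; whether a site is functional has to be confirmed experimentally.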

Fig. 11.1 Bioinformatics in drug development. Stages shown in the flow diagram: genomic, transcriptomic and proteomic databases; mechanism of carcinogenesis or disease; target protein and gene identification; target validation; novel drug discovery; clinical test; FDA-approved drug.

11.3 Bioinformatics in Drug Screening

Drug screening is the process of identifying and selecting compounds with high potential, safety and efficacy before they enter clinical trials. It draws on large libraries of medicinal herbs and chemicals to find the best candidates, so bioinformatics is widely applied in this process (Table 11.1) (Nature 2023). After the biochemical screening steps, potential compounds (“hits”) undergo further tests to check whether they have suitable physicochemical and pharmacological properties for development into drugs; a compound that passes is considered a “lead”. Before entering clinical trials, the lead is screened chemically and biologically to confirm that it has the potential to be developed into a drug. Since the early years of the twenty-first century, bioinformatics has been applied to screening targeted molecules, notably through high-throughput screening. This approach involves screening libraries of compounds related to the drug target, followed by secondary assays for the site of action or for activity on the target protein (Fox et al. 2006; Nemmani 2021). Bioinformatics also contributes greatly to virtual screening, in which substances with undesirable properties are eliminated early by computer-based (in silico) screening, leaving the compounds closest to the desired drug (Smith 2002b). With technological advances and greater data sharing, virtual screening programs now achieve higher hit rates than in the past.
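
The early elimination of compounds with unsuitable physicochemical properties can be illustrated with a simple in silico filter. The sketch below applies Lipinski's rule of five (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors), a widely used heuristic for oral drug-likeness. The chapter does not name a specific filter, so this choice is an assumption, as are the `Compound` class, the tolerance of one violation and the three invented compounds. In practice the descriptors would be computed by a cheminformatics toolkit rather than typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mol_weight: float  # molecular weight in Da
    logp: float        # octanol-water partition coefficient
    h_donors: int      # hydrogen-bond donors
    h_acceptors: int   # hydrogen-bond acceptors


def rule_of_five_violations(c: Compound) -> int:
    """Count how many of Lipinski's four criteria the compound violates."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def filter_hits(hits: list[Compound], max_violations: int = 1) -> list[Compound]:
    """Keep hits with at most max_violations rule-of-five violations (assumed tolerance)."""
    return [c for c in hits if rule_of_five_violations(c) <= max_violations]


# Hypothetical screening hits with invented descriptor values.
library = [
    Compound("cmpd_A", 342.4, 2.1, 2, 5),
    Compound("cmpd_B", 712.9, 6.3, 7, 12),
    Compound("cmpd_C", 498.0, 5.4, 1, 8),
]

leads = filter_hits(library)
print([c.name for c in leads])
```

A filter of this kind only narrows the list; compounds that pass still need the chemical and biological screening described above before they can be treated as leads.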

Table 11.1 Screening models (Hughes et al. 2011)

Bioinformatics has been applied to the screening and selection of potential drugs for diseases of unknown pathogenesis. Bioinformatics analysis has identified genes involved in rheumatoid arthritis (Shi et al. 2020). Compounds with therapeutic potential for this disease were screened through their interaction with a disease-specific gene (LILRB1), which identified kaempferol 3-O-β-d-glucosyl-(1→2)-β-d-glucoside as a molecule that can inhibit the pathological process of rheumatoid arthritis. A 2019 study showed promising results for bioinformatics-assisted screening of compounds that promote the proliferation of cardiac muscle cells while preserving physiological activity, with the aim of healing damaged heart muscle tissue (Mills et al. 2019). Starting from a library of about 5000 compounds, successive screening steps identified two compounds with high potential for myocardial proliferation and the fewest side effects. The profiling relative inhibition simultaneously in mixtures (PRISM) method was developed to increase drug-testing capacity and thereby uncover compounds active against cancer cell lines (Corsello et al. 2020). That study also produced an unexpected result: some drugs not developed to treat cancer nevertheless inhibited cancer cell lines, which opens further research into the molecular characteristics of these cell lines and into new directions for treatment. Recent developments in bioinformatics have also contributed greatly to the screening of drug molecules that target RNA to fight cancer or infection (Manigrasso et al. 2021). Drug molecular structures are not only studied, evaluated computationally for pharmacological activity and used in virtual screening, but are also stored in data libraries for in-depth study and future prediction (Martin et al. 2021).

11.4 Bioinformatics in Drug Validation

According to the U.S. Food and Drug Administration (FDA), drug validation can be understood as the process of collecting and evaluating evidence on the effectiveness of a drug from the time it is designed through experimental testing and commercial production, thereby establishing reliable scientific evidence of product quality (Center for Devices and Radiological Health and Center for Biologics Evaluation and Research 2002). Given today's rapid progress in science and technology, applying technological advances to testing drug effectiveness is also of great interest (Hoffmann et al. 1998). Accordingly, bioinformatics software and platforms have been applied to determine the effectiveness of drugs that target genes, in order to optimize disease therapies.

Bioinformatics techniques and tools can be said to have revolutionized the evaluation of drug efficacy (Table 11.2). Using them, drug development companies have been able to better understand how the human genome affects the effectiveness of therapeutic drugs (Chang 2005). Knowledge of a patient’s genome also allows personalized pharmacological therapies to be developed and prescriptions to be tailored to the patient's drug metabolism. Bioinformatics applications in drug development, such as DNA microarray analysis, have been used to show correlations between metabolic pathways and drug side effects and to evaluate potential new treatment targets (Meloni et al. 2004). Applying machine learning and integrating data on gene–drug relationships, Wang and colleagues identified 96 drugs that target 10 genes serving as biomarkers for atherosclerosis (Wang et al. 2022). Some of these drugs have been found to be effective against stroke or atherosclerosis. Using machine learning and data mining, pharmacologists can evaluate the pharmacological effects of drugs, adjust their 3D structure or develop combination treatment strategies to achieve the best therapeutic effect with the fewest side effects (Agamah et al. 2020). Using genetic data and bioinformatics analysis, scientists have shown that agents such as Echinacea, Omeprazole and Ibudilast are effective in treating periodontitis in type 2 diabetic animals; of these, Echinacea and Ibudilast deserve further research because of their notable medicinal properties (Pan et al. 2022).
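
The step of linking drugs to biomarker genes can be pictured as a lookup over a drug–target table. The sketch below is a minimal illustration of that lookup only. It does not reproduce the machine learning analysis of Wang et al. (2022), and every drug name, gene name and the function `drugs_hitting_biomarkers` is a hypothetical placeholder.

```python
def drugs_hitting_biomarkers(drug_targets: dict[str, set[str]],
                             biomarkers: set[str]) -> dict[str, set[str]]:
    """Return the drugs that target at least one biomarker gene, mapped to the
    biomarker genes they hit, ordered by number of genes hit (then by name)."""
    hits = {}
    for drug, targets in drug_targets.items():
        overlap = targets & biomarkers  # genes that are both drug targets and biomarkers
        if overlap:
            hits[drug] = overlap
    return dict(sorted(hits.items(), key=lambda item: (-len(item[1]), item[0])))


# Hypothetical drug-target annotations and biomarker gene set.
drug_targets = {
    "drug_1": {"GENE_A", "GENE_X"},
    "drug_2": {"GENE_B", "GENE_C"},
    "drug_3": {"GENE_Y"},
}
biomarkers = {"GENE_A", "GENE_B", "GENE_C"}

candidates = drugs_hitting_biomarkers(drug_targets, biomarkers)
for drug, genes in candidates.items():
    print(drug, sorted(genes))
```

A lookup like this yields candidates for repurposing, not evidence of efficacy; each candidate still has to be validated experimentally and clinically.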

Table 11.2 Programs used to determine the potability of a drug (Wooller et al. 2017)

11.5 Conclusion

In this chapter, we have presented the applications of bioinformatics to drug development, screening and validation, providing an overview of what bioinformatics has achieved in the field of pharmacology. However, these approaches still have limitations. Machine learning and deep learning models for drug response prediction often aggregate information while neglecting the biological pathways underlying the prediction; as a result, they often have low predictive accuracy and require much fine-tuning by experts (Ching et al. 2018; Murdoch et al. 2019). In addition, bioinformatics-based predictions and analyses are often still only models, so studies in animals and clinical trials in humans are required to draw accurate conclusions about safety and efficacy in real situations (Shi et al. 2020; Wang et al. 2022; Papillon-Cavanagh et al. 2013).

As science and technology advance rapidly, bioinformatics technologies and software are becoming increasingly refined and accurate. As a result, personalized medicine will receive more research attention, with methods and prescriptions tailored to individual patients, and bioinformatics will play an important role in helping pharmacists and doctors take advantage of the huge resources available (Bayat 2002). Furthermore, developments in bioinformatics have shown the ability to shorten the time and reduce the cost of finding and producing new drugs, and to make use of natural medicinal herbs (Agamah et al. 2020; Tutone and Almerico 2021). Advances in biotechnology have improved understanding of the characteristics of oncogenes and of the biomarkers used to detect them, enabling the development of potential treatment models and gene-targeted drugs (Nguyen and Caldas 2021). Bioinformatics also offers enormous opportunities for developing software and models for predictive medicine, increasing the success rate of clinical trials (Kuenzi et al. 2020).

The pathogenesis of diseases of great interest, such as cancer, will be elucidated through genomics, proteomics and transcriptomics libraries. Drug companies will then identify the target gene or target protein for treating the disease using databases of gene interactions, gene sequencing data and related literature (Thomford et al. 2018). After potential drug molecules are selected with bioinformatics tools, the interaction between the drug and the knock-out gene is studied in vivo and in vitro to yield novel drug discovery results. This is followed by preclinical trials or drug efficacy models, and then by clinical trials to determine the actual safety and effectiveness of the drug in real situations.