Abstract
Drug discovery is costly and time-consuming, and computer-based drug design is one of the most promising approaches to easing this burden. Advances in science and technology, especially in bioinformatics, can significantly shorten the stages of drug discovery while reducing cost and improving treatment efficacy. Bioinformatics tools and platforms not only advance drug target identification and screening but also support drug candidate selection and the evaluation of candidate effectiveness. In recent years, bioinformatics tools have often been used to screen gene sequences to uncover potential binding sites for therapeutic drugs, also known as drug targets. In addition, high-throughput screening is a popular method for drug candidate identification, detecting potential small molecules among the large amount of information in available data libraries. Since the early years of the twenty-first century, researchers have applied bioinformatics to screen targeted molecules using the high-throughput screening model. Bioinformatics also contributes substantially to virtual (in silico) screening, in which substances with undesirable properties are eliminated early by computer so that the compounds closest to the desired drug can be found. With these tools and techniques, the efficacy of drug candidates can be determined more quickly and easily, including in individual patients, which benefits drug validation and personalized pharmacological therapies.
11.1 Introduction
Drug discovery normally starts with the discovery or diagnosis of a novel disease or pathogen that impairs quality of life. Researchers then look for a chemical (which could be a simple molecule or a complex protein) with a therapeutic effect that can benefit patients’ health, and develop a new drug based on that substance. A drug suitable for development on an industrial scale must also have few severe or long-term side effects (Xia 2017), a low likelihood of drug resistance, affordability for patients and profitability for pharmaceutical companies (David et al. 2009; Drews and Ryser 1997), along with minimal damage to the environment (Boxall et al. 2012).
The basic drug discovery pipeline includes target identification and study, hit discovery, hit-to-lead generation, lead optimization, candidate identification, and preclinical and clinical trials (Zhong et al. 2018). Recent estimates suggest that bringing a novel prescription drug to market costs approximately 3 billion USD before tax on average (DiMasi et al. 2016) and takes roughly 13 years (Paul et al. 2010). Nevertheless, only an estimated 13% of potential medicinal chemicals are approved after the clinical phases, which indicates a high risk of failure (Zhong et al. 2018). Possible reasons for this low approval rate include unexpected toxicity, the inability to compete successfully in the market and, most importantly, lack of clinical efficacy (Kola and Landis 2004). Computer-based drug design is one of the most promising approaches to tackling this challenging situation (Baig et al. 2016).
Bioinformatics is an interdisciplinary science that includes proteomics, genomics, transcriptomics and molecular phylogenetics (Xia 2017). It facilitates drug discovery by using high-throughput molecular data to examine differences between symptom carriers (cell lines, animal models or patients) and control groups (Xia 2017). Such comparisons aim to (1) find associations between diseases and genetic, epigenetic and other environmental factors affecting gene expression, (2) screen drug targets relevant to eliminating cellular malfunction or improving function, (3) predict or modify drug candidates to obtain the desired outcome and minimize toxicity, and (4) assess the influence on the environment and the possibility of drug resistance (Xia 2017).
The role of symptom-based bioinformatics in drug development depends on the type of disease: infectious disease, genetic disease or cancer (van Driel and Brunner 2006). For genetic disorders, bioinformatics supports drug discovery mainly by identifying noninvasive tools for genetic diagnosis and prognosis (Wooller et al. 2017). For infectious diseases, it examines the impact of bacterial or viral presence on gene expression and compares it with the effects of other pathogens or of drugs in order to explore new uses for existing drugs (Wooller et al. 2017). Bioinformaticians can also identify the main genetic causes of cancer in individual patients and hence personalize cancer treatment and facilitate the discovery of novel drugs or the repurposing of existing ones (Wooller et al. 2017; Zhang et al. 2009). In drug screening, bioinformatics supports high-throughput screening of libraries related to the drug target as well as secondary assays (Fox et al. 2006; Nemmani 2021). It also contributes to the early elimination of candidates with undesirable properties (Smith 2002a). In the next step of the drug discovery pipeline, bioinformatics software and platforms are applied to drug validation. Thanks to bioinformatics tools and techniques, pharmaceutical companies have gained a better understanding of how the human genome affects the efficacy of drug candidates (Chang 2005).
One of the earliest and best-known contributions of bioinformatics to the pharmaceutical industry is the discovery, using simple string matching, of sequence homology between platelet-derived growth factor (PDGF) and v-sis, an oncogene from simian sarcoma virus (Doolittle et al. 1983; Waterfield et al. 1983). This finding opened two new lines of thinking in cancer biology. First, growth factors such as PDGF could be targets for anti-cancer drugs (Pietras et al. 2003). Second, cancer can be the end result of changes in any factor that regulates gene expression. This bioinformatics-driven finding led to a new conceptual framework for anti-cancer drug development (Moffat et al. 2014). It is just one outstanding example of how bioinformatics can facilitate the development of a novel drug. This chapter therefore focuses on how bioinformatics supports the discovery of potential drugs, covering its role in drug development, drug screening and drug validation, along with some notable achievements.
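The string-matching idea behind the PDGF and v-sis comparison can be illustrated with a minimal sketch. It is not the procedure used in the cited 1983 studies; it only shows two simple ways to detect shared stretches between two sequences, an exact longest common substring and a shared k-mer fraction. The function names and the two peptide strings are hypothetical and chosen for illustration only.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest run of identical residues shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # match lengths for the previous row of the DP table
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous residue pair.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def shared_kmer_fraction(a: str, b: str, k: int = 3) -> float:
    """Fraction of the distinct k-mers in a that also occur in b."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    if not kmers_a:
        return 0.0
    return len(kmers_a & kmers_b) / len(kmers_a)


# Hypothetical toy peptides, NOT the real PDGF or v-sis sequences.
query = "MKTLVAEPAMIAECKT"
subject = "GGSLVAEPAMIAQRRW"

print(longest_common_substring(query, subject))
print(shared_kmer_fraction(query, subject))
```

Exact matching of this kind misses similarity that involves substitutions or gaps, which is why alignment-based tools are normally preferred for real homology searches.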
11.2 Bioinformatics in Drug Development
Drug development is the work of researching and identifying suitable new drug molecules, from the early stages through phase III clinical trials, together with bringing drugs to market and testing them afterward (Chen et al. 2021). Finding a suitable drug takes a long time and costs a great deal of money (Preziosi 2007); today, with advances in science and technology, especially in bioinformatics, the stages of drug development can be significantly shortened while cost is reduced and treatment effectiveness increases (Chen et al. 2021; Moore and Allen 2019). Currently, high-throughput screening is a popular method for detecting potential small molecules among the large amount of information in available data libraries (McLean 2015). The molecules are then tested for their ability to bind to their target or to work in vivo and, if appropriate, can be used as a starting point for drug testing in animals. In addition, bioinformatics helps scientists relate disease symptoms to genetic mutations, identify drugs capable of restoring or eliminating damaged cells, predict the effectiveness and side effects of drugs, and assess drug resistance (Xia 2017).
With the vast amount of data available from gene libraries and from reports on mutations, epigenetics, proteomics and biological processes, bioinformatics has contributed greatly to the discovery of potential drugs (Fig. 11.1). Through genome analysis, biologists and pharmacologists can find drugs capable of treating genetic diseases or targeting pathogens. Bioinformatics tools are often used to screen gene sequences, thereby uncovering potential binding sites for therapeutic drugs (Xia 2012). For example, bioinformatics research has shown that potential LXR response elements (LXREs) regulate the human ADFP gene, which has important implications for the treatment of fatty liver (Kotokorpi et al. 2010). Studies of pathogen genomes, such as those of bacteria, have revealed genetic sites specific to disease-causing species; these are important targets for treating infections while limiting the development of drug resistance (Gal-Mor and Finlay 2006). Bioinformatics is also used to explore metabolic pathways in pathogenic microorganisms, and drugs that target these pathways may be developed in the future (Bhatia et al. 2014). Bioinformatics can also reduce the cost of drug discovery by repurposing existing drugs to treat new pathogens (Ding et al. 2014). Knowledge of the genome and of the surface components of many pathogenic microorganisms has suggested further targets: galactofuranose, for example, is an important component of pathogenic bacteria that is not found in humans, and many studies have proposed it as a potential drug development target (Pedersen and Turco 2003). Bioinformatics also provides a rich source of data on epigenetic changes, genes, metabolic processes and related substances, thereby helping to develop more effective drugs (Kanehisa 2013).
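One common way to screen a sequence for candidate binding sites is a position weight matrix (PWM), the technique named in the title of the cited Xia (2012) reference. The following is a minimal sketch under stated assumptions, not the method of any cited study: the aligned sites, the promoter fragment, the uniform background frequency of 0.25 and the pseudocount of 1 are all hypothetical choices made for illustration, and the sequences are assumed to contain only A, C, G and T.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds position weight matrix from equal-length aligned sites."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        # Log-odds of the smoothed base frequency against a uniform background.
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return (start, window, score) tuples."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


# Hypothetical aligned binding sites (toy data, not a curated LXRE motif).
known_sites = ["AGGTCA", "AGGTCA", "AGGTTA", "GGGTCA"]
pwm = build_pwm(known_sites)

# Hypothetical promoter fragment to screen.
promoter = "TTCAGGTCATTGC"
best_start, best_window, best_score = max(scan(promoter, pwm), key=lambda hit: hit[2])
print(best_start, best_window, round(best_score, 2))
```

In practice a score threshold and a significance test would decide which windows count as putative sites; the sketch simply reports the highest-scoring window.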
Bioinformatics provides not only genetic and mutation data but also a large amount of information about transcription, supporting drug development through phenotypic screening to identify potential drug candidates and determine drug targets (Xia 2017). Information on gene expression or metabolic patterns obtained from bioinformatics databases plays an important role in discovering drugs such as anti-cancer drugs or treatments for metabolic diseases (Wishart 2016; Xia et al. 2009). Working from database information, Muhseen and colleagues computationally identified natural compounds with great potential for drug development against COVID-19 (Muhseen et al. 2021). A series of reports using quantitative structure–activity relationship (QSAR) modeling, machine learning and deep learning have yielded promising results for the development of anti-aging drugs or the treatment of infections (Yeh et al. 2021; Araujo et al. 2020).
11.3 Bioinformatics in Drug Screening
Drug screening is the process of identifying and selecting drugs with high potential, safety and efficacy before they enter clinical trials. Because this work draws on a large amount of information from libraries of medicinal herbs and chemicals in order to produce the best medicine, bioinformatics is widely applied in the process (Table 11.1) (Nature 2023). After the biochemical screening steps, potential compounds (“hits”) undergo further tests to check whether they have appropriate physicochemical and pharmacological properties for development into drugs; a compound that passes is considered a “lead”. Before entering clinical trials, the lead is screened chemically and biologically to establish its potential to be developed into a drug. Since the early years of the twenty-first century, researchers have applied bioinformatics to screen targeted molecules, with high-throughput screening as one such approach. This model involves screening libraries close to the drug target, followed by secondary assays for the site of action or the ability to function on the target protein (Fox et al. 2006; Nemmani 2021). Bioinformatics also contributes substantially to virtual screening: substances with undesirable properties are eliminated early through computational, in silico screening, so that the compounds closest to the desired drug can be found (Smith 2002b). With technological advances and the ability to share data, virtual screening programs are achieving higher hit rates than in the past.
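The early elimination of substances with undesirable properties can be sketched as a simple property filter. The chapter does not say which criteria are applied, so the example below assumes Lipinski's rule of five (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors and at most 10 hydrogen-bond acceptors, with no more than one violation tolerated) as a stand-in. The compound names and descriptor values are hypothetical; in a real workflow the descriptors would be computed by a cheminformatics toolkit rather than typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mol_weight: float       # molecular weight in g/mol
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five limits the compound exceeds."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ])


def prefilter(library, max_violations=1):
    """Keep compounds with at most max_violations rule-of-five violations."""
    return [c for c in library if rule_of_five_violations(c) <= max_violations]


# Hypothetical library entries with made-up descriptor values.
library = [
    Compound("cmpd_A", 320.4, 2.1, 2, 5),
    Compound("cmpd_B", 612.8, 6.3, 4, 9),
    Compound("cmpd_C", 455.0, 5.4, 1, 7),
    Compound("cmpd_D", 780.2, 3.0, 7, 14),
]

kept = prefilter(library)
print([c.name for c in kept])
```

A filter like this is cheap enough to run over an entire library before any docking or assay work, which is the sense in which virtual screening removes unsuitable compounds early.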
Bioinformatics has been applied to the screening and selection of potential drugs for diseases of unknown pathogenesis. Bioinformatics analysis identified genes involved in rheumatoid arthritis (Shi et al. 2020). Compounds with therapeutic potential for this disease were screened through interaction with a disease-specific gene (LILRB1), which identified kaempferol 3-O-β-d-glucosyl-(1→2)-β-d-glucoside as a molecule that can inhibit the pathological process of rheumatoid arthritis. A 2019 study demonstrated the promise of bioinformatics in screening for compounds that help cardiac muscle cells proliferate while preserving physiological activity, in order to heal damaged heart muscle tissue (Mills et al. 2019). From about 5000 compounds in the library, the screening steps identified two compounds with high applicability in myocardial proliferation and the fewest side effects. The profiling relative inhibition simultaneously in mixtures (PRISM) method was developed to increase drug-testing capacity and thereby uncover potential compounds against cancer cell lines (Corsello et al. 2020). That study also produced an unexpected result: some drugs not used to treat cancer nevertheless inhibit cancer cell lines, which opens the way to further research into the molecular characteristics of these cell lines and into directions for treatment. Recent developments in bioinformatics have contributed greatly to the screening of drug molecules that target RNA to fight cancer or infection (Manigrasso et al. 2021). Drug molecular structures are not only studied, evaluated computationally for pharmacological activity and used in virtual screening, but also stored in data libraries for in-depth study and future prediction (Martin et al. 2021).
11.4 Bioinformatics in Drug Validation
According to the U.S. Food and Drug Administration (FDA), drug validation can be understood as the process of collecting and evaluating evidence of a drug’s effectiveness from the time it is designed through experimentation and commercial production, thereby establishing a body of reliable scientific evidence for product quality (Center for Devices and Radiological Health and Center for Biologics Evaluation and Research 2002). With the rapid development of science and technology, applying technological advances to testing drug effectiveness is also of great interest (Hoffmann et al. 1998). Accordingly, bioinformatics software and platforms have been applied to determine the effectiveness of drugs that target genes, in order to optimize the effects of disease therapies.
Bioinformatics techniques and tools can be said to have revolutionized the evaluation of drug efficacy (Table 11.2). With them, drug development companies have been able to better understand how the human genome affects the effectiveness of therapeutic drugs (Chang 2005). Knowledge of a patient’s genome will also allow personalized pharmacological therapies to be developed and prescriptions to be tailored to his or her drug metabolism. Bioinformatics applications such as DNA microarrays have been developed to show the correlation between metabolic pathways and drug side effects and to evaluate new potential targets for treatment (Meloni et al. 2004). Applying machine learning and synthesizing data on the relationship between genes and drugs, Wang and colleagues identified 96 drugs that target 10 genes serving as biomarkers for atherosclerosis (Wang et al. 2022). Some of these drugs have been found to be effective for stroke or atherosclerosis. Using machine learning and data mining, pharmacologists can evaluate the pharmacological effects of drugs, adjust their 3D structures or develop drug combination strategies to achieve the best treatment effect with the fewest side effects (Agamah et al. 2020). Using genetic data and bioinformatics analysis, scientists have demonstrated that some drugs, such as Echinacea, omeprazole and ibudilast, are effective in treating periodontitis in type 2 diabetic animals; of these, Echinacea and ibudilast deserve further research because of their notable medicinal properties (Pan et al. 2022).
11.5 Conclusion
In this chapter, we have presented the applications of bioinformatics in drug development, screening and validation, providing an overview of what bioinformatics has achieved in the field of pharmacology. However, some limitations remain. Machine learning and deep learning models for drug response prediction are often assemblies of information that neglect the biological pathways underlying the prediction; they therefore often have low predictive accuracy and require much fine-tuning by experts (Ching et al. 2018; Murdoch et al. 2019). In addition, bioinformatics-based predictions and analyses are often still only models, so they require trials in animals and humans to draw accurate conclusions about safety and efficacy in real situations (Shi et al. 2020; Wang et al. 2022; Papillon-Cavanagh et al. 2013).
Science and technology are developing rapidly, and bioinformatics technologies and software are becoming more refined and more accurate. As a result, personalized medicine will receive more research attention aimed at tailoring methods and prescriptions to individual patients, and bioinformatics will play an important role in helping pharmacists and doctors take advantage of the huge resources available (Bayat 2002). Furthermore, developments in bioinformatics have shown the ability to shorten the search time and reduce the cost of producing new drugs and to make use of natural sources of medicinal herbs (Agamah et al. 2020; Tutone and Almerico 2021). Advances in biotechnology have improved understanding of the characteristics of oncogenes and the biomarkers used to detect them, supporting the development of potential treatment models or gene-targeted drugs (Nguyen and Caldas 2021). Bioinformatics also has enormous application opportunities in the development of software and models for predictive medicine, increasing the success rate of clinical trials (Kuenzi et al. 2020).
The pathogenesis of diseases of great interest, such as cancer, will be elucidated through genomics, proteomics and transcriptomics libraries. Drug companies will then identify the target gene or target protein for treating the disease based on databases of gene interactions, gene sequencing data and related articles (Thomford et al. 2018). After potential drug molecules have been selected using bioinformatics tools, the interaction between the drug and the knocked-out gene will be studied in vivo and in vitro to yield novel drug discovery results. This is followed by preclinical trials or drug efficacy models and then clinical trials to determine the actual safety and effectiveness of the drug in real situations.
References
Agamah FE et al (2020) Computational/in silico methods in drug target and lead prediction. Brief Bioinform 21(5):1663–1675
Araujo PHF et al (2020) Identification of potential COX-2 inhibitors for the treatment of inflammatory diseases using molecular modeling approaches. Molecules 25(18):4183
Baig MH et al (2016) Computer aided drug design: success and limitations. Curr Pharm Des 22(5):572–581
Bayat A (2002) Science, medicine, and the future: bioinformatics. BMJ 324(7344):1018–1022
Bhatia B et al (2014) Identification of glutamate ABC-transporter component in Clostridium perfringens as a putative drug target. Bioinformation 10(7):401–405
Boxall AB et al (2012) Pharmaceuticals and personal care products in the environment: what are the big questions? Environ Health Perspect 120(9):1221–1229
Center for Devices and Radiological Health and Center for Biologics Evaluation and Research (2002) General principles of software validation. FDA-1997-D-0029. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/general-principles-software-validation
Chang PL (2005) Clinical bioinformatics. Chang Gung Med J 28(4):201–211
Chen Z et al (2021) Applications of artificial intelligence in drug development using real-world data. Drug Discov Today 26(5):1256–1264
Ching T et al (2018) Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface 15(141):20170387
Corsello SM et al (2020) Discovering the anti-cancer potential of non-oncology drugs by systematic viability profiling. Nat Cancer 1(2):235–248
David E, Tramontin T, Zemmel R (2009) Pharmaceutical R&D: the road to positive returns. Nat Rev Drug Discov 8(8):609–610
DiMasi JA, Grabowski HG, Hansen RW (2016) Innovation in the pharmaceutical industry: new estimates of R&D costs. J Health Econ 47:20–33
Ding H et al (2014) Similarity-based machine learning methods for predicting drug–target interactions: a brief review. Brief Bioinform 15(5):734–747
Doolittle RF et al (1983) Simian sarcoma virus onc gene, v-sis, is derived from the gene (or genes) encoding a platelet-derived growth factor. Science 221(4607):275–277
Drews J, Ryser S (1997) The role of innovation in drug development. Nat Biotechnol 15(13):1318–1319
Fox S et al (2006) High-throughput screening: update on practices and success. J Biomol Screen 11(7):864–869
Gal-Mor O, Finlay BB (2006) Pathogenicity islands: a molecular toolbox for bacterial virulence. Cell Microbiol 8(11):1707–1719
Hoffmann A et al (1998) Computer system validation: an overview of official requirements and standards. Pharm Acta Helv 72(6):317–325
Hughes JP et al (2011) Principles of early drug discovery. Br J Pharmacol 162(6):1239–1249
Kanehisa M (2013) Molecular network analysis of diseases and drugs in KEGG. Methods Mol Biol 939:263–275
Kola I, Landis J (2004) Can the pharmaceutical industry reduce attrition rates? Nat Rev Drug Discov 3(8):711–715
Kotokorpi P et al (2010) The human ADFP gene is a direct liver-X-receptor (LXR) target gene and differentially regulated by synthetic LXR ligands. Mol Pharmacol 77(1):79–86
Kuenzi BM et al (2020) Predicting drug response and synergy using a deep learning model of human cancer cells. Cancer Cell 38(5):672–684.e6
Manigrasso J, Marcia M, De Vivo M (2021) Computer-aided design of RNA-targeted small molecules: a growing need in drug discovery. Chem 7(11):2965–2988
Martin WJ, Grandi P, Marcia M (2021) Screening strategies for identifying RNA- and ribonucleoprotein-targeted compounds. Trends Pharmacol Sci 42(9):758–771
McLean L (2015) 49—Drug development. In: Hochberg MC et al (eds) Rheumatology, 6th edn. Mosby, Philadelphia, pp 395–400
Meloni R, Khalfallah O, Biguet NF (2004) DNA microarrays and pharmacogenomics. Pharmacol Res 49(4):303–308
Mills RJ et al (2019) Drug screening in human PSC-cardiac organoids identifies pro-proliferative compounds acting via the mevalonate pathway. Cell Stem Cell 24(6):895–907.e6
Moffat JG, Rudolph J, Bailey D (2014) Phenotypic screening in cancer drug discovery—past, present and future. Nat Rev Drug Discov 13(8):588–602
Moore H, Allen R (2019) What can mathematics do for drug development? Bull Math Biol 81(9):3421–3424
Muhseen ZT et al (2021) Computational determination of potential multiprotein targeting natural compounds for rational drug design against SARS-COV-2. Molecules 26(3):674
Murdoch WJ et al (2019) Definitions, methods, and applications in interpretable machine learning. Proc Natl Acad Sci U S A 116(44):22071–22080
Nature (2023) Drug screening articles from across Nature Portfolio. Nature
Nemmani KVS (2021) Pharmacological screening: drug discovery. In: Poduri R (ed) Drug discovery and development: from targets and molecules to medicines. Springer Singapore, Singapore, pp 211–233
Nguyen LV, Caldas C (2021) Functional genomics approaches to improve pre-clinical drug screening and biomarker discovery. EMBO Mol Med 13(9):e13189
Pan S et al (2022) Identification of cross-talk pathways and ferroptosis-related genes in periodontitis and type 2 diabetes mellitus by bioinformatics analysis and experimental validation. Front Immunol 13:1015491
Papillon-Cavanagh S et al (2013) Comparison and validation of genomic predictors for anticancer drug sensitivity. J Am Med Inform Assoc 20(4):597–602
Paul SM et al (2010) How to improve R&D productivity: the pharmaceutical industry’s grand challenge. Nat Rev Drug Discov 9(3):203–214
Pedersen LL, Turco SJ (2003) Galactofuranose metabolism: a potential target for antimicrobial chemotherapy. Cell Mol Life Sci 60(2):259–266
Pietras K et al (2003) PDGF receptors as cancer drug targets. Cancer Cell 3(5):439–443
Preziosi P (2007) 2.06—Drug development. In: Taylor JB, Triggle DJ (eds) Comprehensive medicinal chemistry II. Elsevier, Oxford, pp 173–202
Shi YQ, Qi WF, Kong CY (2020) Drug screening and identification of key candidate genes and pathways of rheumatoid arthritis. Mol Med Rep 22(2):986–996
Smith A (2002a) Screening for drug discovery: the leading question. Nature 418(6896):453–459
Smith A (2002b) Screening for drug discovery: the leading question. Nature 418(6896):453–455
Thomford NE et al (2018) Natural products for drug discovery in the 21st century: innovations for novel drug discovery. Int J Mol Sci 19(6):1578
Tutone M, Almerico AM (2021) Computational approaches: drug discovery and design in medicinal chemistry and bioinformatics. Molecules 26(24):7500
van Driel MA, Brunner HG (2006) Bioinformatics methods for identifying candidate disease genes. Hum Genom 2(6):429–432
Wang J et al (2022) Identification of immune cell infiltration and diagnostic biomarkers in unstable atherosclerotic plaques by integrated bioinformatics analysis and machine learning. Front Immunol 13:956078
Waterfield MD et al (1983) Platelet-derived growth factor is structurally related to the putative transforming protein p28sis of simian sarcoma virus. Nature 304(5921):35–39
Wishart DS (2016) Introduction to cheminformatics. Curr Protoc Bioinform 53(1):14.1.1–14.1.21
Wooller SK et al (2017) Bioinformatics in translational drug discovery. Biosci Rep 37(4):BSR20160180
Xia X (2012) Position weight matrix, Gibbs sampler, and the associated significance tests in motif characterization and prediction. Scientifica (Cairo) 2012:917540
Xia X (2017) Bioinformatics and drug discovery. Curr Top Med Chem 17(15):1709–1726
Xia J et al (2009) MetaboAnalyst: a web server for metabolomic data analysis and interpretation. Nucleic Acids Res 37(Web Server issue):W652–W660
Yeh SJ, Lin JF, Chen BS (2021) Multiple-molecule drug design based on systems biology approaches and deep neural network to mitigate human skin aging. Molecules 26(11):3178
Zhang J, Yang PL, Gray NS (2009) Targeting cancer with small molecule kinase inhibitors. Nat Rev Cancer 9(1):28–39
Zhong F et al (2018) Artificial intelligence in drug design. Sci China Life Sci 61(10):1191–1204
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Dao, N.A., Vu, TD., Chu, DT. (2024). Bioinformatics in Drug Discovery. In: Singh, V., Kumar, A. (eds) Advances in Bioinformatics. Springer, Singapore. https://doi.org/10.1007/978-981-99-8401-5_11
DOI: https://doi.org/10.1007/978-981-99-8401-5_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8400-8
Online ISBN: 978-981-99-8401-5
eBook Packages: Biomedical and Life Sciences (R0)