
1 Introduction

Bacterial diseases, such as tuberculosis, pneumonia, cholera, diphtheria, meningitis, tetanus, Lyme disease, gonorrhea, and syphilis, are a leading global health threat and represent a major cause of morbidity and mortality. The disease pathologies are largely caused by the production of toxins or by an aggressive immune response to bacterial antigens. Although improved awareness about sanitation, advancements in vaccination, and the discovery of antibiotics have greatly minimized the impact of these diseases, many hurdles yet remain. The emergence of drug-resistant bacterial strains and the subsequent resurgence of diseases such as tuberculosis have reestablished infection as a prominent global threat.

Despite the rising problem of drug resistance, no effective alternative therapeutics have emerged to protect against these deadly bacterial infections. According to the World Health Organization (WHO), approximately 700,000 people die every year due to drug-resistant infections worldwide (de Kraker et al. 2016), with that number estimated to reach ten million by 2050. Currently available approaches to drug discovery are expensive and time consuming, and there is thus a great need for new approaches to overcome this problem. Computational methods now play a key role in scientific research, not only in the discovery of novel drug targets but also in many other fields, such as genomics, proteomics, metabolomics, transcriptomics, systems biology, and molecular phylogenetics. These methods allow scientists to link disease symptoms to particular mutations, epigenetic modifications, and various other genetic and environmental alterations, thereby identifying potential drug targets and novel therapeutics with fewer side effects and lower potential for drug resistance (i.e., minimizing long-term adverse effects on human health while improving cost- and time-effectiveness). The current chapter therefore focuses on the role and real-world applicability of computational approaches in antibacterial drug discovery.

2 Genomic Approaches to Drug Discovery

Genomic approaches are regularly used for the detailed study of an organism’s genetic code and are applied in the fields of DNA sequencing and recombinant DNA technology, and in understanding the assembly, annotation, and interpretation of the structure and function of a genome (Mishra and Srivastava 2017). The rise of the genomics era has played a significant role in sequence-based vaccine and drug development and has provided a novel path to the investigation of underlying disease mechanisms. Genomic study is well suited to the identification of potential drug targets and the design of novel therapeutics and vaccines, as well as the prediction of their side effects on human health, and represents a promising approach to combating drug resistance and disease resurgence.

Identifying potential antibacterial drug targets from genome sequences remains a major challenge, as the number of genes with unknown biological function is still high. Bioinformatics has played a crucial role in identifying homologs of known genes through comparative genomic analysis of new sequences against biochemically characterized protein and enzyme sequences. Complete bacterial genome sequences provide new information about a disease, its pathogenesis, resurgence, and drug resistance, and genomics is therefore used to predict potential antibacterial drug targets. Since the completion of the first whole bacterial genome sequence, that of Haemophilus influenzae in 1995, computational approaches have been vital in providing information about pathogens, pathogenesis, and antibiotic resistance. To date, thousands of bacterial genomes have been sequenced, and many more sequencing projects are ongoing. The genome of a particular bacterial pathogen can contain thousands of genes, providing a massive collection of potential antigens and drug targets. Genomics can thus be used to identify potential drug candidates faster and more accurately than conventional methods. Genomic approaches are used not only in target-based screening studies but also in whole genome expression profiling to study the cellular response to therapeutic manipulation of an antimicrobial drug target. Additionally, genomics can be used to screen large numbers of herbal compounds possessing antibacterial activity and to identify novel structural classes of antibiotics. The genomics era has therefore had a significant influence on vaccine and therapeutic development.

2.1 Reverse Vaccinology

Reverse vaccinology is defined as a genome-based approach to vaccine development. It uses computation to design novel vaccines from the information present in the genome, without the need to grow specific microorganisms in the laboratory. The first vaccine developed by reverse vaccinology was the meningococcal B (MenB) vaccine for the prevention of Neisseria meningitidis infection (Rappuoli et al. 2012). In this study, the complete genome of MenB was sequenced and computationally analyzed. The vaccine candidate selected in silico was then expressed in Escherichia coli to allow for in vitro testing. This work represented a significant contribution to the field of vaccine development. Since then, the approach has been successfully applied in the development of vaccines against various organisms, including Bacillus anthracis, Streptococcus pneumoniae, Staphylococcus aureus, Chlamydia pneumoniae, Porphyromonas gingivalis, Edwardsiella tarda, and Mycobacterium tuberculosis (Motin and Torres 2009). The approach is also being used in the development of protein-based vaccines against antibiotic-resistant Staphylococcus aureus and Streptococcus pneumoniae, and to interpret transcriptomic and proteomic data in order to generate a short list of candidate antigens for in vitro analysis, thereby reducing the cost and time of downstream processing (Del Tordello et al. 2017). Reverse vaccinology thus represents a new vaccine development strategy that greatly improves upon conventional methods.

2.2 Next-Generation Sequencing

Present approaches for the identification of human pathogens are not sufficient to provide complete information related to disease pathogenesis. Next-generation sequencing (NGS) can determine the complete genome sequence of an organism in a single sequencing run. The results interpreted from these data reveal, in detail, the underlying mechanisms of disease virulence and resistance, provide information on disease outbreaks, and are being used to monitor the current and historic emergence of drug resistance in bacteria and other microbes. NGS technologies are used in medical microbiology laboratories for the identification and characterization of causal pathogens, for rapid identification of bacteria using the 16S–23S rRNA region, and in taxonomic and metagenomic approaches to the study of infectious disease. NGS is specifically used to study the evolution and dynamics of drug resistance in bacterial pathogens (ECDC 2016). Various software packages are available for NGS analysis, including CLC Genomics Workbench (Qiagen) (Powell 2018), SPAdes (Lapidus et al. 2014), and Velvet (Zerbino 2010) for genome assembly, as well as multilocus sequence typing (MLST) (Belén et al. 2009), core genome MLST (cgMLST) (Ghanem et al. 2018), and whole genome MLST (wgMLST) approaches for investigating genetic relationships (Chen et al. 2017). Other packages, such as SeqSphere (Ridom) (Kohl 2014) and BioNumerics (Applied Maths, bioMérieux) (Hunter et al. 2005), and online tools, such as EnteroBase and BIGSdb (Bacterial Isolate Genome Sequence Database) (Jolley and Maiden 2010), are also available.
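As a rough illustration of how MLST-style typing relates isolates to one another, the sketch below counts the loci at which two allele profiles differ. It is a minimal example under stated assumptions, not the algorithm of any tool named above: the isolate names, locus names, and allele numbers are hypothetical, and loci with a missing allele call (None) are simply skipped.

```python
from itertools import combinations


def allelic_distance(profile_a, profile_b):
    """Count loci whose allele numbers differ between two profiles.

    Loci absent from either profile, or with a missing call (None),
    are ignored rather than counted as differences.
    """
    distance = 0
    for locus, allele_a in profile_a.items():
        allele_b = profile_b.get(locus)
        if allele_a is None or allele_b is None:
            continue
        if allele_a != allele_b:
            distance += 1
    return distance


def pairwise_distances(profiles):
    """Return allelic distances for every pair of isolates, keyed by name pair."""
    return {
        (a, b): allelic_distance(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical allele profiles: locus name -> allele number.
profiles = {
    "isolate_1": {"locusA": 1, "locusB": 3, "locusC": 1, "locusD": 1},
    "isolate_2": {"locusA": 1, "locusB": 3, "locusC": 4, "locusD": 1},
    "isolate_3": {"locusA": 2, "locusB": 5, "locusC": 4, "locusD": None},
}

distances = pairwise_distances(profiles)
for pair, dist in distances.items():
    print(pair, dist)
```

A small allelic distance suggests that two isolates are closely related. In cgMLST and wgMLST the same comparison is made over hundreds or thousands of loci, which gives finer resolution than a classical scheme based on a handful of housekeeping genes.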

3 Proteomic Approaches to Drug Discovery

Proteomics is the science dealing with the global analysis of cellular proteins (Osman et al. 2009). The proteome is defined as the complete set of proteins produced by an organism under certain conditions (Wasinger et al. 1995). In recent years, proteomics has become a powerful tool for the analysis of complex biochemical mechanisms and for the identification of new protein structures, functions, and protein–protein interactions. In addition, investigating protein profiles in response to antibiotic sensitivity and drug resistance can contribute significantly to the development of therapeutics for disease recurrence. Furthermore, with the emergence of bioinformatics it has become easier to retrieve information about specific genomes and proteomes for in-depth analysis. In short, proteomics has a diverse range of applications, including drug target identification and validation, efficacy and toxicity testing, and the investigation of drug mechanisms and activities.

3.1 Molecular Modeling

Knowledge of protein 3D structure is central to structure-based drug design, as protein structure is vital to ligand binding. Homology modeling has rapidly become the first choice for protein 3D structure prediction (Srivastava and Tiwari 2017). Homology modeling techniques predict the structure of a protein from its sequence similarity to a homologous protein of known structure. The method is easier, cheaper, and less time consuming than conventional experimental means, and is reliable when a suitable template is available. The 3D structures of proteins provide valuable information about the underlying mechanisms and functions of the molecules, which plays a significant role in drug discovery. Presently, over 137,000 experimental protein structures are available in the Protein Data Bank (PDB) (19).

Homology modeling involves several steps, including template identification, multiple sequence alignment, and model building based on the 3D structure of a template, as well as model refinement, optimization, and validation. Various studies have revealed the significance of homology modeling in drug discovery. In one such study, the 3D structure of the N315 stp1 protein produced by a clinical strain of Staphylococcus aureus was predicted by homology modeling using Modeller software (Jain et al. 2014). The predicted structure was then validated with the PROCHECK, ERRAT, VERIFY-3D, and ProSA tools, and further refined with GROMACS software to obtain a more stable configuration. The predicted 3D structure of Staphylococcus aureus N315 stp1 thus provided valuable information about structure–activity relationships and interactions with the protein. Another study found that drug resistance in Mycobacterium tuberculosis is due to a multidrug efflux mechanism of the Mycobacterium multidrug resistant (MMR) protein (Malkhed et al. 2013), which belongs to the small multidrug-resistant (SMR) family of proteins. Considering the MMR protein as a novel target, an in silico tertiary structure of the protein was designed and constructed in order to identify a novel drug molecule against drug resistance in Mycobacterium tuberculosis. Another study on the virulent CGSP14 strain of Streptococcus pneumoniae (Karavadi et al. 2014a, b) revealed that the genome codes for 2206 proteins, among which the polysaccharide polymerase protein (B2ILP9) and capsular polysaccharide biosynthesis protein (B2ILP4) act as efficient drug targets; these were modeled through homology in order to discover the role of the proteins in the disease pathway of pneumonia.

Hence, homology modeling provides information about protein structure and function in less time and at lower cost than conventional means, representing a boon for drug design and development. Various tools for molecular modeling are available, such as Modeller, I-TASSER, LOMETS, SWISS-MODEL, and GROMACS (Frantisek et al. 2007; Zhou et al. 2015; Chou 2004).
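The template identification step described above typically relies on how similar the target sequence is to candidate templates. The following sketch, offered only as an illustration, ranks candidate templates by percent identity over the aligned, non-gap columns of pre-computed pairwise alignments. The sequences and template names are hypothetical, and real tools use more elaborate scoring (profiles, coverage, structure quality) than this single number.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == gap or t == gap:
            continue  # skip gapped columns
        compared += 1
        if q == t:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def rank_templates(alignments):
    """Sort templates by percent identity to the query, best first.

    `alignments` maps a template name to a pair
    (aligned query sequence, aligned template sequence).
    """
    scores = {
        name: percent_identity(query, template)
        for name, (query, template) in alignments.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical pairwise alignments of one query against two templates.
alignments = {
    "template_A": ("MKT-AYIAKQR", "MKTLAYLAKQ-"),
    "template_B": ("MKT-AYIAKQR", "MRT-GYIGKHR"),
}

ranked = rank_templates(alignments)
for name, identity in ranked:
    print(f"{name}: {identity:.1f}% identity")
```

Here template_A would be preferred for model building because it shares more identical residues with the query. In practice the chosen template is then passed to a modeling program, and the resulting model is refined and validated as outlined in the text.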

3.2 Virtual Screening and Molecular Docking

Virtual screening is defined as the process of screening molecules from a library of chemical compounds based on their scoring and binding affinity to a specific target (Chen 1977). It is performed to select the most promising drug candidates based on their chemical properties, the Lipinski rule of five (Lipinski et al. 2001), their ADMET properties (absorption, distribution, metabolism, excretion, and toxicity) (Cheng et al. 2013), their interaction with the specific target, and their binding affinity as determined by a molecular docking study.
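The Lipinski rule of five can be applied as a simple pre-filter before docking. The sketch below assumes that the four descriptors (molecular weight, logP, hydrogen-bond donors, and hydrogen-bond acceptors) have already been computed by a cheminformatics package. The compound names and descriptor values are hypothetical, and allowing at most one violation is a common convention adopted here as an assumption rather than a fixed requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mol_weight: float   # molecular weight in daltons
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(compound):
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        compound.mol_weight > 500,
        compound.logp > 5,
        compound.h_donors > 5,
        compound.h_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(compound, max_violations=1):
    """Accept a compound with no more than `max_violations` violations (assumed cutoff)."""
    return lipinski_violations(compound) <= max_violations


# Hypothetical library with pre-computed descriptors.
library = [
    Compound("cmpd_1", 342.4, 2.1, 2, 5),   # no violations
    Compound("cmpd_2", 612.7, 5.8, 3, 9),   # weight and logP too high
    Compound("cmpd_3", 488.0, 4.9, 6, 8),   # one donor too many
]

selected = [c.name for c in library if passes_rule_of_five(c)]
print(selected)
```

Compounds that survive this filter would then move on to ADMET assessment and docking against the target, where binding affinity becomes the main ranking criterion.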

These methods have been used to screen a host of potential therapeutics. For example, in a study involving abnormal prion protein (PrPSc), which is responsible for the pathogenesis of prion diseases, structure-based drug discovery approaches were used to filter out the most promising compound with anti-prion effects, which showed a high binding score and spatial interaction with the target and remarkably reduced prion disease pathogenesis (Ishibashi et al. 2016). In another study, proteins identified from highly virulent strains of Streptococcus pneumoniae (HUNGARY19A-6, D39, TIGR4, G54, CGSP14, TCH8431-19A) were modeled and validated (Nastasa 2018). Structure-based virtual screening was then performed to identify novel drug molecules against the proteins, and a docking study was used to analyze the protein–ligand interactions and binding affinity. Peptide deformylase (PDF), which is essential to the pathogenesis of several bacterial diseases, has been used as an attractive target to identify novel antibacterial drug molecules based on binding affinity with hydrazine derivatives using virtual screening and molecular docking (Karavadi et al. 2014a, b). Bacterial topoisomerase, a key target in antibacterial and anticancer drug discovery, has been the subject of in silico screening to discover potential bacterial topoisomerase I inhibitors and their structural motifs (Khursheed 2013). Various online and offline tools and software, such as Pharmer, Catalyst, PharmaGist, Blaster, AnchorQuery, LigandScout, AutoDock, SwissDock, and GOLD, are available for study and analysis (Sandhaus et al. 2018; Peter 2010; Kujawski et al. 2012; Jones and Rowland 2013; Parrott et al. 2014; Dubey et al. 2011).

3.3 Molecular Dynamic Simulation

Molecular dynamics (MD) simulation plays a very significant role in the drug discovery process. Through MD simulation, one can track rapid processes in biological systems that occur in less than a millisecond. It is used to study the physical movement of atoms and of biologically significant macromolecules, such as proteins, nucleic acids, and carbohydrates (Cumming et al. 2013). Calculating the binding free energy of ligand–protein and protein–protein interactions is an important feature of the simulations.

Structure-based virtual screening with MD simulation and free energy calculation was used to study the activity of a non-peptide compound against falcipain 1 and 2 (FP-1 and FP-2), Plasmodium-produced proteins that catalyze hemoglobin degradation, and their analogs (Borhani 2012). A South African natural compound, 5PGA, and five further potential compounds were identified as having inhibitory activity against FP-1, FP-2, and their analogs. In another study using MD simulation of two E. coli–produced Resistance-Nodulation-Division (RND) transporters, AcrB and AcrD, which play a major role in multidrug resistance, researchers connected the specificity patterns of the two transporters to their physicochemical and topographical properties, based on calculation of multifunctional recognition sites on the molecular surface (Musyoka et al. 2016). A further molecular docking and MD simulation study of meropenem and imipenem, two antibiotics with different binding affinities for the efflux pump in P. aeruginosa and for AcrB structures, revealed a greater susceptibility of meropenem over imipenem to the binding site of AcrB, and in-depth analysis identified a key residue involved in the binding interaction (Ramaswamy et al. 2017). These examples illustrate how beneficial MD simulation is as a tool for structure-based drug design. Various MD simulation software packages are available, such as Amber, CHARMM, GROMACS, NAMD, and Schrödinger’s Desmond (Bajic et al. 2016; Salomon-Ferrer et al. 2013; Brooks 2009; Phillips et al. 2005).
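At its core, an MD simulation repeatedly integrates Newton's equations of motion over very small time steps. The toy sketch below uses the velocity Verlet scheme, a widely used MD integrator, on a single particle attached to a harmonic spring. It is only an illustration of the principle: the force field, mass, spring constant, and time step are arbitrary assumptions in reduced units, and the packages listed above handle many thousands of interacting atoms with far more elaborate force fields.

```python
import math


def harmonic_force(x, k):
    """Restoring force of a spring with constant k at displacement x."""
    return -k * x


def velocity_verlet(x0, v0, mass, k, dt, n_steps):
    """Integrate one particle on a spring; return a list of (time, position, velocity)."""
    x, v = x0, v0
    force = harmonic_force(x, k)
    trajectory = [(0.0, x, v)]
    for step in range(1, n_steps + 1):
        v_half = v + 0.5 * dt * force / mass   # half-step velocity update
        x = x + dt * v_half                    # full-step position update
        force = harmonic_force(x, k)           # force at the new position
        v = v_half + 0.5 * dt * force / mass   # complete the velocity update
        trajectory.append((step * dt, x, v))
    return trajectory


def total_energy(x, v, mass, k):
    """Kinetic plus potential energy of the particle."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Arbitrary reduced units chosen for illustration.
MASS, K, DT, N_STEPS = 1.0, 1.0, 0.01, 1000

trajectory = velocity_verlet(x0=1.0, v0=0.0, mass=MASS, k=K, dt=DT, n_steps=N_STEPS)
t_end, x_end, v_end = trajectory[-1]

energy_start = total_energy(trajectory[0][1], trajectory[0][2], MASS, K)
energy_end = total_energy(x_end, v_end, MASS, K)
analytic_x = math.cos(math.sqrt(K / MASS) * t_end)  # exact solution for x0=1, v0=0

print(f"final position: {x_end:.5f} (analytic {analytic_x:.5f})")
print(f"energy drift: {energy_end - energy_start:.2e}")
```

The near-constant total energy is the property that makes this family of integrators suitable for long simulations. Quantities such as binding free energies are then estimated from the ensemble of configurations that the trajectory samples.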

3.4 Toxicity Prediction

Determining the safety and toxicity of chemical compounds is a crucial step in the drug discovery process. In silico toxicology is a computational approach used to visualize, analyze, simulate, and predict the toxicity of a chemical (Shaw 2006). The aim of toxicity testing is to identify the harmful effects of chemicals on both the patient and the environment. Various parameters are involved in toxicity testing, such as the rate, frequency, and dosage of exposure, ADMET (absorption, distribution, metabolism, excretion/elimination, and toxicity) properties, and other biological and chemical properties. In silico toxicology testing minimizes the use of animals while also reducing cost and time requirements, and it improves the safety assessment of chemicals. Various tools and software are available for computational toxicity testing, including OSIRIS Property Explorer, ALOGPS, ADMET Prediction, Molinspiration, and TOPKAT (Parthasarathi and Dhawan 2018; Dearden 2003; Tetko and Bruneau 2004; Ekins et al. 2017; Nadeem et al. 2015).

4 Antibacterial/Antimicrobial Databases

Biological databases are among the most important resources in bioinformatics. They contain vast libraries of biological information, including genes, proteins, metabolic pathways, microarray data, next-generation sequencing data, and much more. Many freely available antibacterial/antimicrobial databases provide valuable information about the sequences, structures, and signatures of genes and proteins. This information plays a vital role in the drug discovery process, both in gene/protein analysis and in the screening of novel synthetic and herbal therapeutic compounds. A selection of important and useful antibacterial/antimicrobial databases is listed in Table 11.1.

Table 11.1 List of antibacterial/antimicrobial databases

5 Future Perspectives

The emergence of drug-resistant bacteria is a major threat facing the world today, and there is a great need to improve the methods by which we investigate antibacterial drug-resistance mechanisms and to discover new therapeutic interventions to combat the issue. Traditional approaches to antibacterial drug discovery are time consuming and expensive, whereas computational approaches allow researchers to combine biological and chemical parameters in order to streamline the drug discovery pipeline in a time- and cost-effective manner. Combining computational methods with wet laboratory experiments provides efficient ways to understand the mechanisms of drug resistance, as well as pathogen virulence and progression. Computational approaches play a significant role in predicting the functions, properties, and activities of antimicrobial agents and their interactions with therapeutic targets throughout the drug discovery process. Bioinformatics provides advanced tools and software that allow microbiologists to analyze and interpret high-throughput experimental data in a cost-effective manner, while the use of various databases, tools, and software by R&D laboratories makes practical application easier to realize. It is foreseen that the continued synergy of experimental data with computational approaches will lead to a new age of antibacterial drug discovery.