1 Introduction

Genomics serves several purposes, and one of the most significant is comparative genomics, whose data can be used to identify antibiotic targets for treating deadly pathogens (Shahbaaz et al. 2016). Analysis of a pathogen's genes and gene products can reveal proteins that may serve as potential drug targets. Novel drug targets have been identified successfully for various pathogens by using this approach (Georrge and Umrania 2012; John et al. 2012; Georrge and Umrania 2011; Vaishnav et al. 2015; John and Kotadiya 2015; Trivedi and Georrge 2016; Georrge 2016; George and Georrge 2019; Abouelwafa and Georrge 2017; George et al. 2017).

Combinatorial chemistry; high-throughput screening, including in silico virtual screening; absorption, distribution, metabolism, and excretion–toxicity (ADMET) screening; and de novo and structure-based drug design all serve to expedite and economise the modern-day drug discovery process (Baig et al. 2016).

Various species of the genus Vibrio are deadly pathogens of aquatic life. The most affected animals are finfish, shellfish, and mammals. Vibrio–host interaction appears to play a critical role in the survival and dissemination of these bacteria in the marine environment. V. splendidus LGP32 is also known as V. splendidus strain Mel32 (Balbi et al. 2013). This strain causes significant mortalities in oysters, such as Crassostrea gigas, during the summer. A genome sequence provides accurate information about the proteins involved in virulence and pathogenesis. The V. splendidus LGP32 genome consists of two circular chromosomes, of 3.299 Mb (chromosome 1) and 1.675 Mb (chromosome 2), with an average G + C content of 44.03% and 43.64%, respectively (Federhen et al. 2014; Le Roux et al. 2009). The genome sequence of LGP32 revealed homologs of genes that have been associated with virulence in other organisms (S-i and Shinoda 2000; Villicana et al. 2019; Zhang and Austin 2005; Coulthurst 2019). The research presented here asks what vital functions these homologs serve and how they affect oysters. Strains associated with the “summer mortalities” syndrome have severely damaged oyster farming. The syndrome has been studied from several angles, including the biology and etiology of the infectious agents, environmental factors, and the physiology and genetics of the host (Clerissi et al. 2020; King et al. 2019). Strains isolated from different samples show variable virulence in aquatic animals when tested independently.
In oyster experiments, V. splendidus populations have commonly been found in moribund animals, and the agonism observed between strains suggests that V. splendidus vibriosis results from microbial interactions (Bruto et al. 2018; Gay et al. 2004).

2 Materials and Methods

2.1 Identification of Drug Targets

The genome of Vibrio splendidus LGP32 was sequenced by the Institut Pasteur, Paris. The reference proteins of V. splendidus LGP32 were retrieved from the NCBI website (www.ncbi.nlm.nih.gov) and subjected to tblastn against the expressed sequence tags (ESTs) and nucleotide sequences of Ostreidae, with an E value cutoff of 10⁻⁴. The sequences obtained for the shared proteins were again subjected to tblastn (Mount 2007; Altschul et al. 1990) against Ostreidae with the same E value cutoff of 10⁻⁴, and the resulting set of host-homologous proteins was eliminated. Further, a BLAST search was performed against the Database of Essential Genes (DEG) (Peng et al. 2017; Gao et al. 2015) (http://tubic.tju.edu.cn/deg/) to identify the set of essential genes mandatory for the survival of the selected pathogen, V. splendidus LGP32.
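The subtraction step described above can be illustrated with a minimal Python sketch. It assumes the BLAST results were saved in the standard 12-column tabular format (`-outfmt 6`), which the chapter does not state. The protein IDs and hit lines are hypothetical toy data. The chapter does not give a cutoff for the DEG search, so reusing the same value there is also an assumption.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-4  # cutoff quoted in the text for the host search


def queries_with_hits(blast_tsv: str, cutoff: float = E_VALUE_CUTOFF) -> set:
    """Return query IDs that have at least one hit with E value <= cutoff.

    Assumes BLAST tabular output (-outfmt 6): query ID in column 1,
    E value in column 11.
    """
    hits = set()
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        if float(row[10]) <= cutoff:
            hits.add(row[0])
    return hits


# Hypothetical toy data: three pathogen proteins and two result tables.
pathogen_proteins = ["prot_A", "prot_B", "prot_C"]
host_blast = (
    "prot_A\thost_est_1\t78.5\t120\t20\t1\t1\t120\t5\t364\t2e-30\t110\n"
    "prot_B\thost_est_9\t31.0\t40\t25\t2\t3\t42\t10\t129\t0.5\t28.1\n"
)
deg_blast = "prot_B\tDEG_0001\t65.0\t300\t90\t3\t1\t300\t1\t300\t1e-50\t250\n"

# Step 1: drop proteins with a significant hit in the host (Ostreidae).
host_hits = queries_with_hits(host_blast)
non_homologous = [p for p in pathogen_proteins if p not in host_hits]

# Step 2: keep only proteins with a significant hit in DEG
# (the cutoff for this step is an assumption).
deg_hits = queries_with_hits(deg_blast)
essential = [p for p in non_homologous if p in deg_hits]

print(non_homologous)
print(essential)
```

In this toy run, `prot_A` is removed as a host homolog, and only `prot_B` survives both filters.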

The metabolic pathways (Altman et al. 2013; Aoki-Kinoshita and Kanehisa 2007) involving the essential proteins of V. splendidus LGP32 were elucidated using the KEGG Automatic Annotation Server (KAAS) (www.genome.jp/tools/kaas/). KAAS (Moriya et al. 2007) performs homology search by BLAST against the KEGG database, which is a manually curated gene database and provides the functional annotations of query genes as output. The output provides details of the KEGG Orthology (KO) assignments and the corresponding KEGG pathways. The pathways predicted were assessed for occurrence of alternate pathways and based on the observed results, the proteins were selected as potential drug targets.
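As an illustration of how the KO assignments might be handled downstream, the sketch below parses a KAAS-style assignment list. It assumes one query per line, with a tab and a KO identifier when an assignment was made and the query ID alone otherwise. The protein IDs and KO identifiers are illustrative only. Keeping one representative per KO is a simplification of the redundancy removal, not the authors' exact procedure.

```python
from typing import Dict, List, Optional


def parse_ko_assignments(text: str) -> Dict[str, Optional[str]]:
    """Map each query ID to its KO identifier, or None when unassigned."""
    assignments: Dict[str, Optional[str]] = {}
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if not fields[0]:
            continue  # skip blank lines
        ko = fields[1] if len(fields) > 1 and fields[1] else None
        assignments[fields[0]] = ko
    return assignments


# Hypothetical KAAS-style output: prot_B received no KO assignment.
kaas_output = "prot_A\tK00647\nprot_B\nprot_C\tK00647\nprot_D\tK09458\n"

assignments = parse_ko_assignments(kaas_output)

# Queries that could not be annotated.
unannotated: List[str] = [q for q, ko in assignments.items() if ko is None]

# Keep the first query seen for each KO so that each KO appears once.
representative_per_ko: Dict[str, str] = {}
for query, ko in assignments.items():
    if ko is not None:
        representative_per_ko.setdefault(ko, query)

print(unannotated)
print(representative_per_ko)
```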

2.2 Three-Dimensional Structure Modelling and Validation

Following target identification, the three-dimensional structure of the 3-oxoacyl-(acyl carrier protein) synthase identified from the proteome of Vibrio splendidus LGP32 was modelled. Due to a lack of template in the PDB, the protein was modelled using I-TASSER, an ab initio protein modelling tool. The I-TASSER server is freely available to the academic community at https://zhanglab.ccmb.med.umich.edu/I-TASSER/. I-TASSER estimates the accuracy of its predictions with a confidence score (C-score) based on the relative clustering structural density and the consensus significance score of multiple threading templates (Yang et al. 2015).

The predicted protein structure was then validated with the PROCHECK suite and ANOLEA (Atomic Non-Local Environment Assessment). The PROCHECK suite provides a detailed check on the stereochemistry of a protein structure (Laskowski et al. 1996). ANOLEA is a web server that performs energy calculations at the atomic level in protein structures (Melo et al. 1997).

2.3 Docking, Drug Likeliness, and Toxicity Analysis

Around 200,000 (two lakh) synthetic chemical compounds were obtained from the literature and from databases, including PubChem (http://pubchem.ncbi.nlm.nih.gov/) and the drug-like compounds of the NCI database (http://ligand.info).

The Molegro Virtual Docker (MVD) was used to perform docking. The MVD has been shown to yield a higher docking accuracy than other state-of-the-art docking products (Thomsen and Christensen 2006; De Azevedo 2010). It possesses two docking algorithms, the MolDock Optimizer and Simplex Evolution (SE). The receptor-ligand docking was initiated by the selection of receptor from the molecule tab, followed by the cavity preparation. The structure data format (sdf) file of ligands was uploaded and the docking option from the docking wizard was selected.

The absorption, distribution, metabolism, and excretion (ADME) properties were assessed using the Molsoft ADME property facility (http://www.molsoft.com/mpropdesc.html), which is part of the Molsoft ICM application. When a compound is drawn in ICM, important ADME-toxicity and drug-likeness properties can be monitored. The ICM browser provides researchers with direct access to rich structural biology resources and protein families. The method is robust and fast (about 5000 compounds per second).
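To make the idea of a drug-likeness filter concrete, the sketch below applies Lipinski's rule of five to precomputed descriptors. This is a swapped-in, widely used heuristic and not the Molsoft scoring used in this study. The compound names and descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors for one compound (toy values)."""
    name: str
    mol_weight: float        # molecular weight in Da
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept compounds with at most one violation, as commonly done."""
    return lipinski_violations(d) <= max_violations


# Hypothetical compounds, not taken from the screened library.
library = [
    Descriptors("compound_1", 342.4, 2.1, 2, 5),
    Descriptors("compound_2", 712.9, 6.3, 7, 12),
]

drug_like = [d.name for d in library if is_drug_like(d)]
print(drug_like)
```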

3 Results and Discussion

The Vibrio splendidus strain LGP32 is a pathogen of Crassostrea gigas oysters and is associated with their summer mortalities, which affect annual production worldwide. This has increased the demand for new drugs against the pathogen; however, few new drugs have been identified, because developing a new drug requires a high investment for a comparatively small market (Duperthuy et al. 2010).

3.1 Identification of Drug Targets

The Vibrio splendidus LGP32 proteome consists of 4432 reference proteins. When these were subjected to tblastn against the ESTs and nucleotide sequences of the Ostreidae species, 3397 proteins were found to be unique to the pathogen, with no significant host homologs. Of these 3397 proteins, 1521 were identified as essential according to the DEG analysis. From these 1521 proteins, 256 hypothetical proteins that lacked experimental evidence for their in vivo expression were eliminated from further pathway analysis, leaving a total of 1265 essential proteins.

The involvement of the 1265 identified essential proteins in metabolic pathways was analyzed using the KAAS server. KAAS could not annotate 353 of the 1265 essential proteins. A further 381 proteins were annotated by KAAS but had no pathway ID, so a detailed analysis of those proteins was not possible. Detailed pathway analysis of the remaining 531 proteins revealed 185 proteins whose inhibition the organism could survive. Hence, these proteins were omitted, and the remaining 346 crucial proteins were taken for further downstream processing. Some proteins shared the same KO pathway ID within a single pathway; after eliminating these redundant proteins, 83 proteins remained as unique and novel drug targets (Fig. 22.1).

Fig. 22.1 Summary of target identification
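The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
reference_proteins = 4432
non_homologous = 3397        # no significant hit in Ostreidae
essential = 1521             # hits in DEG
hypothetical = 256           # removed: no evidence of in vivo expression
not_annotated = 353          # KAAS gave no annotation
no_pathway_id = 381          # annotated, but without a pathway ID
dispensable = 185            # organism could survive their inhibition
final_targets = 83           # after removing redundant KO pathway IDs

# Derived counts, each matching the value quoted in the text.
characterised = essential - hypothetical                          # 1265
with_pathway = characterised - not_annotated - no_pathway_id      # 531
crucial = with_pathway - dispensable                              # 346

funnel = [
    ("Reference proteins", reference_proteins),
    ("Non-homologous to host", non_homologous),
    ("Essential (DEG)", essential),
    ("Essential, non-hypothetical", characterised),
    ("With KEGG pathway ID", with_pathway),
    ("Crucial (no alternate pathway)", crucial),
    ("Unique novel targets", final_targets),
]

for label, count in funnel:
    share = 100.0 * count / reference_proteins
    print(f"{label:32s} {count:5d}  ({share:5.1f}% of proteome)")
```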

The identified 83 novel drug targets were found to be involved in 25 different pathways/biological processes. The pathways were further divided into 12 classes including the amino acid, lipid, and carbohydrate metabolism; energy metabolism; glycan biosynthesis and metabolism; metabolism of cofactors and vitamins; terpenoids and polyketides; nucleotide metabolism; genetic information processing; cellular processes; environmental information processing; organismal systems; and human diseases. Figure 22.2 details the distribution of novel drug targets across different metabolic pathways/biological processes.

Fig. 22.2 Percentage distributions of novel drug targets involved in different metabolic pathways/biological process

Manual analysis of the 83 predicted drug targets identified two proteins with strong potential as drug targets (Table 22.1), both involved in fatty acid biosynthesis (KEGG Map ID: ko00061). This pathway is of crucial importance to Gram-negative bacteria, as its mutation or removal leads to the death of the organism (Wang and Quinn 2010a, b).

Table 22.1 Information about the finalized target therapeutics

The 3-oxoacyl-(acyl carrier protein) synthase I (fabB) [EC: 2.3.1.41], also known as beta-ketoacyl-acyl-carrier-protein synthase I, is responsible for the chain-elongation step of dissociated (type II) fatty-acid biosynthesis and thus plays a key role in the synthesis of fatty acids. This enzyme is mainly located in the cytoplasm as well as in the cytoplasmic membrane. The bacterial pathway offers several unique sites for selective inhibition by chemotherapeutic agents. The antibacterial effect is exerted through the selective targeting of beta-ketoacyl-(acyl-carrier-protein) synthase I (FabB) in the synthetic pathway of fatty acids (Hermans et al. 2016).

3.2 Three-Dimensional Structure Modelling and Validation

The 3D structure of the 3-oxoacyl-(acyl carrier protein) synthase I (fabB) was modelled ab initio with I-TASSER, and the model was selected based on its confidence score (C-score) (Fig. 22.3). The C-score estimates the quality of predicted models based on the significance of the threading template alignments and the convergence parameters of the structure assembly simulations. It typically ranges from −5 to 2, with a higher C-score signifying a model with higher confidence and vice versa. The predicted structure comprised 403 amino acids, 3390 bonds, and 3348 atoms. Its quality was assessed with PROCHECK and the ANOLEA server. PROCHECK uses the Ramachandran plot, and the predicted model was judged to be of good quality, with 93.6%, 5.8%, and 0.6% of residues in the most favored regions, the allowed regions, and the disallowed regions, respectively (Fig. 22.4). A good-quality model is expected to have >90% of amino acids in the most favored regions. All the main chain and side chain parameters of the predicted model fell in the “better” region. Like the C-score, the G-factor (which should be > −0.5) indicates the reliability of the predicted model. It is a log-odds score computed from the observed distribution of stereochemical parameters (Morris et al. 1992; Evans 2007). The observed G-factors for the predicted model were 0.00 for dihedral angles, −1.55 for covalent geometry, and −0.67 overall. Of the main chain bond lengths and bond angles, 74.6% and 80.2%, respectively, were within limits. The planar groups were also within limits.

The ANOLEA result gives a graphical view of the energy value of each amino acid. Most of the amino acids had a negative energy value (Fig. 22.5). For a given amino acid in the target, negative energy values, shown in green, represent a favorable energy environment, while positive values, shown in red, represent an unfavorable energy environment (Melo et al. 1997).
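The two cutoffs quoted above (>90% of residues in the most favored regions, G-factor > −0.5) can be applied mechanically to the reported values. The following sketch does only that; the dictionary keys and function name are illustrative, and the numbers are those given in the text.

```python
# Values reported in the text for the predicted fabB model.
ramachandran_pct = {"most_favored": 93.6, "allowed": 5.8, "disallowed": 0.6}
g_factors = {"dihedral": 0.00, "covalent": -1.55, "overall": -0.67}

# Cutoffs quoted in the text.
MIN_MOST_FAVORED_PCT = 90.0
MIN_G_FACTOR = -0.5


def check_model(rama: dict, g: dict) -> dict:
    """Compare the reported values with the quoted cutoffs."""
    report = {"ramachandran_ok": rama["most_favored"] > MIN_MOST_FAVORED_PCT}
    for name, value in g.items():
        report[f"g_factor_{name}_ok"] = value > MIN_G_FACTOR
    return report


report = check_model(ramachandran_pct, g_factors)
for key, passed in report.items():
    print(f"{key}: {'pass' if passed else 'below cutoff'}")
```

With these values, the Ramachandran criterion and the dihedral G-factor pass, while the covalent and overall G-factors fall below the quoted −0.5 cutoff.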

Fig. 22.3 I-TASSER predicted the 3D structure of the 3-oxoacyl-(acyl carrier protein) synthase I

Fig. 22.4 Predicted 3D structure of 3-oxoacyl-(acyl carrier protein) synthase I quality analysis: PROCHECK

Fig. 22.5 Predicted 3D structure of 3-oxoacyl-(acyl carrier protein) synthase I quality analysis: ANOLEA

3.3 Virtual Screening and Docking

Molecular docking plays a key role in assessing the binding efficiency between receptor and ligand. The modelled fabB protein was docked with around 200,000 chemicals using MVD. The top ten hits, ranked by docking energy score, are shown in Table 22.2; these compounds may block the target. The top ten compounds for 3-oxoacyl-(acyl carrier protein) synthase I comprise four natural products, one antifungal, three antiviral, and two anticancer molecules.

Table 22.2 Details about the docking energy value of the top ten drug-like molecules for the target 3-oxoacyl-[acyl-carrier-protein] synthase I
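Selecting the top hits from a large docking run amounts to ranking ligands by score. The sketch below assumes that the docking score is energy-like, so that more negative values indicate better predicted binding. The ligand names and scores are hypothetical and are not the values in Table 22.2.

```python
import heapq
from typing import Dict, List, Tuple


def top_hits(scores: Dict[str, float], n: int = 10) -> List[Tuple[str, float]]:
    """Return the n ligands with the lowest (most negative) docking score."""
    return heapq.nsmallest(n, scores.items(), key=lambda item: item[1])


# Hypothetical docking scores (arbitrary energy units).
docking_scores = {
    "lig_a": -98.2,
    "lig_b": -143.7,
    "lig_c": -110.0,
    "lig_d": -75.4,
}

best_two = top_hits(docking_scores, n=2)
for ligand, score in best_two:
    print(f"{ligand}\t{score:.1f}")
```

For a library of about 200,000 compounds, `heapq.nsmallest` avoids sorting the full list when only ten entries are needed.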

CID 2879872 and CID 2913532 are inhibitors of Sfp phosphopantetheinyl transferase (PPTase) in bacteria (Yasgar et al. 2010). CID 330973 and CID 3152845 are inhibitors of RecA-Intein splicing activity, DnaB-Intein splicing activity, and GFP chromophore formation in bacteria (Lew and Paulus 2002). CID 16403955 is an inhibitor of VIM-2 metallo-beta-lactamase in bacteria (Yamaguchi et al. 2007). CID 5389752 and CID 5389951 are inhibitors of PMK (phosphomevalonate kinase), MK (mevalonate kinase), and DPM-DC (diphosphomevalonate decarboxylase) of the mevalonate pathway in Streptococcus pneumoniae (Kudoh et al. 2010). CID 5739314 is an inhibitor of streptokinase A precursor in Streptococcus pyogenes M1 GAS. CID 5389834 is an inhibitor of pyruvate kinase in bacteria (Suzuki et al. 2008). CID 44142679 is an inhibitor of AddAB recombination protein complex and putative recombination protein RecB in bacteria (Marsin et al. 2010; Wang and Maier 2009).

4 Conclusion

This study shows how genomics can contribute to finding treatments for infections caused by pathogens. A comparative genomics approach can be applied efficiently to retrieve valuable information about drug target molecules for treating various infections caused by pathogens. The approach distinguishes the genes of the pathogen from those of the host, narrowing the search down to organism-specific genes that can be tested as potential drug targets. There is a constant need to keep looking for novel drug molecules as a means of protection against pathogens that are resistant to available antibiotics. Advances in various in silico approaches are allowing the screening of multiple proteins predicted as potential drug targets.

Here we described the entire approach for identifying drug candidates that can block the 3-oxoacyl-(acyl carrier protein) synthase I protein. The approach explores the possibility of developing new drugs from a list of available chemical molecules. Microbes are acquiring resistance against existing drugs, which complicates the treatment of pathogens of aquatic animals. Hence, novel and effective drugs need to be designed and made available for practical use. In this context, the case study described in this chapter can serve as a model for finding alternatives to existing treatments.

The present study contributes ten potential inhibitors that may combat the pathogenic microorganism. The docking results suggest that these are the most promising drug-like molecules against Vibrio splendidus LGP32. Biological confirmation of these selected drug-like molecules is required to check their efficacy against the organism. In future work, these molecules could be established as biologically confirmed drugs through accepted experimental approaches and analyses. For example, the cup borer method can be used to test the antibacterial activity of the drug-like molecules identified through genomics.