1 Introduction

Invasive aspergillosis is a life-threatening infection caused by the ubiquitous fungus Aspergillus fumigatus in immunosuppressed hosts [1]. A. fumigatus exploits qualitative or quantitative deficiencies in the host immune defence to establish invasive infection, which is extremely difficult to treat in immunocompromised hosts. Mortality from invasive aspergillosis exceeds 50% even after chemotherapy. In 1996, the estimated cost of treating patients diagnosed with invasive aspergillosis in the U.S. was $633 million, or roughly $64,500 per case [1]. In acutely immunosuppressed patients, currently available antifungals have only moderate efficacy [2]. According to a 2009 article in the journal Emerging Infectious Diseases, drug resistance is a rapidly growing problem in the treatment of Aspergillus infection [1, 3, 4]. Moreover, therapy often fails and is hampered by the severe side effects of available antifungals. Hence, new drug targets are required that allow aspergillosis to be treated effectively with fewer side effects on the host [5].

Traditional approaches have been used to screen drug targets and vaccines against aspergillosis for the last two decades [5]. The screening of novel, potent drug targets has been limited by a lack of data, methodology and information. These problems have made it necessary to screen for possible drugs against this hazardous pathogen, and the search for new therapeutic targets has driven the screening of novel drug candidates. Owing to the online availability of newer gene, genome and protein databases, comparative genomic analysis is considered one of the most reliable strategies for designing potential drugs [6, 7]. In the present work, we have attempted to develop a procedure that overcomes the limitations of the studies done so far. Our computational procedure incorporates whole-genome analysis, subtractive/comparative analysis, metabolic pathway analysis and subcellular localization analysis.

In the present work, we have used subtractive/comparative genome analysis, in which redundant (identical) proteins are removed and the pathogen proteome is then compared with that of the human host. The essential gene set is the smallest set of genes necessary for the growth and survival of the pathogen. We adopted an in silico approach based on subtractive/comparative genomics on the whole proteome of A. fumigatus to identify the unique gene set in the metabolic pathways using KEGG and the gene set associated with the membrane using CELLO [8]. This computational approach has been used successfully to identify drug targets in many other fungi and bacteria, such as Helicobacter pylori [9], Staphylococcus aureus [10], I [11], Bacillus anthracis [12] and Neisseria meningitidis [13]. Hence, we adopted it to detect potential novel drug targets that are virulent and have unique metabolic pathways. However, these putative drug targets should be verified experimentally before drug design.

2 Materials and Methods

Pathogenic proteins of A. fumigatus were subjected to a broad-spectrum anti-Aspergillus drug target detection procedure that incorporates comparative and subtractive genomics techniques. The complete protocol of the target detection procedure is shown in Fig. 1.

Fig. 1. Complete description of the protocol of the target detection procedure

2.1 Program and Data Acquisition of Pathogen Protein

The complete proteome of A. fumigatus was retrieved from the NCBI protein database (http://www.ncbi.nlm.nih.gov/) in FASTA format and submitted to TiDv2, a target identification software. TiDv2 is a standalone tool developed by us as an extension of TiD, which was also developed by our team [14]. It provides functions for paralog analysis, essentiality analysis and non-homolog analysis: it excludes paralogous proteins, retains essential genes found in DEG, in CEG or in both, and excludes proteins homologous to the host and gut flora from the essential gene dataset [12, 15].
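As an illustration of the input format only, the following minimal sketch reads FASTA text into a dictionary using plain Python. It is not the TiDv2 implementation; the function name parse_fasta and the two example records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into {identifier: sequence}."""
    records = {}
    current_id = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            records[current_id] = []
        elif current_id is not None:
            # Sequence lines may be wrapped, so collect and join them later.
            records[current_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


# Hypothetical two-record example; identifiers and sequences are made up.
example = """>protA hypothetical protein
MKTAYIAK
QRQISFVK
>protB another hypothetical protein
MSLNE
"""
proteome = parse_fasta(example)
print(len(proteome), "proteins loaded")
```

For a real proteome file, the same function could be applied to the file contents read with open(...).read().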

The target prioritization analysis tab characterized the resulting essential gene dataset in terms of druggability and virulence. Virulent proteins are vital for the growth, pathogenicity and survival of the pathogen [1].

Target prioritization was carried out on the gut flora non-homologous proteins with online tools: the Kyoto Encyclopedia of Genes and Genomes (KEGG) for pathway analysis, InterProScan for functional annotation and CELLO for subcellular localization. BioPython v2.7.10 scripts were written and integrated with executables and datasets on the Microsoft Visual Studio 2015 platform to build TiDv2. In TiDv2, we added a fungal radio button to the DEG tab to identify essential genes of fungi.

2.2 Identification of Paralogous Proteins

Target detection and characterization parameters and procedures followed our earlier work [14]. The CD-HIT suite was used to eliminate redundant paralogous proteins from the downloaded fungal proteome, with a sequence identity cut-off of 60%.
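A command-line equivalent of this step can be sketched as follows, assuming a local cd-hit installation rather than the CD-HIT suite used here. The helper cd_hit_command and the file names are hypothetical; the -i, -o, -c and -n options and the pairing of word size with identity threshold follow the CD-HIT user guide.

```python
def cd_hit_command(fasta_in, fasta_out, identity=0.60):
    """Build an argument list for a local `cd-hit` run at the given identity."""
    # The CD-HIT user guide ties the word size (-n) to the identity threshold.
    if identity >= 0.7:
        word_size = 5
    elif identity >= 0.6:
        word_size = 4
    elif identity >= 0.5:
        word_size = 3
    elif identity >= 0.4:
        word_size = 2
    else:
        raise ValueError("cd-hit does not support identity thresholds below 0.4")
    return ["cd-hit", "-i", fasta_in, "-o", fasta_out,
            "-c", f"{identity:.2f}", "-n", str(word_size)]


# Hypothetical file names; the list could be passed to subprocess.run().
command = cd_hit_command("proteome.faa", "non_paralogous.faa")
print(" ".join(command))
```

The function only assembles the arguments, so it can be inspected without CD-HIT being installed.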

2.3 Selection of Essential Proteins

The non-paralogous proteins were compared against DEG using BLASTP at an e-value of 10^-10 and a bit score >= 100. The resulting genes are vital for the survival and growth of the pathogen.
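The thresholds above can be applied to BLASTP results as in the following minimal sketch. It assumes the standard 12-column tabular output of BLAST+ (-outfmt 6), where the e-value is column 11 and the bit score is column 12. The function name essential_hits and the example rows are hypothetical, and this is not the TiDv2 code.

```python
def essential_hits(tabular_lines, max_evalue=1e-10, min_bitscore=100.0):
    """Return query IDs with at least one hit passing both thresholds."""
    essential = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query_id = fields[0]
        evalue = float(fields[10])    # column 11: e-value
        bitscore = float(fields[11])  # column 12: bit score
        if evalue <= max_evalue and bitscore >= min_bitscore:
            essential.add(query_id)
    return essential


# Hypothetical hits against DEG; all identifiers and numbers are made up.
rows = [
    "protA\tDEG001\t55.0\t300\t120\t5\t1\t300\t1\t298\t1e-50\t250",
    "protB\tDEG002\t30.0\t120\t80\t4\t1\t120\t5\t124\t1e-05\t48.5",
    "protC\tDEG003\t40.0\t200\t110\t6\t1\t200\t3\t201\t1e-12\t95.0",
]
print(sorted(essential_hits(rows)))
```

In this example only protA passes: protB fails both thresholds and protC fails the bit score threshold.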

2.4 Identification of Human and Gut Flora Non-homologs

We identified human non-homologous proteins among the essential fungal genes using a threshold e-value of 0.005 and a bit score >= 100. Gut flora non-homologous proteins were obtained at an e-value of 10^-4 and a bit score >= 100. The screened dataset was treated as the set of possible drug targets and characterized for druggability and virulence. The resulting gene set was then mapped to UniProt identifiers (http://www.uniprot.org/).
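The subtractive logic of this step can be sketched as below. The sketch assumes that a protein counts as non-homologous when it has no hit in the reference proteome, or when its best hit has an e-value above the cut-off. The bit score criterion is omitted for brevity, and all identifiers and e-values are hypothetical.

```python
def non_homologs(candidates, best_evalue, cutoff):
    """Keep candidates whose best hit is weaker than the cutoff or absent."""
    kept = []
    for protein_id in candidates:
        evalue = best_evalue.get(protein_id)  # None means no hit was reported
        if evalue is None or evalue > cutoff:
            kept.append(protein_id)
    return kept


# Hypothetical best e-values per essential protein against each reference set.
essential = ["protA", "protB", "protC", "protD"]
human_best = {"protA": 1e-40, "protB": 0.3, "protC": 0.004}
gut_best = {"protB": 1e-6, "protD": 0.02}

human_free = non_homologs(essential, human_best, cutoff=0.005)
targets = non_homologs(human_free, gut_best, cutoff=1e-4)
print(human_free, targets)
```

Running the human filter before the gut flora filter mirrors the order of the procedure; since each filter only removes proteins, the final set does not depend on that order.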

2.5 Metabolic Pathway Analysis

To check whether the mapped gene set participates in metabolic pathways, this dataset of potential targets was characterized on the KAAS server (KEGG Automatic Annotation Server). These targets were further searched with BLAST against the KEGG database to obtain their functional annotation. KAAS assigns K numbers to similar sequences using the bi-directional best hit method, an automated procedure that enables the reconstruction of KEGG pathways. The output gene set carries KO (KEGG Orthology) assignments, which identify the metabolic proteins [16].
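The KO assignments returned by KAAS can be collected as in this minimal sketch. It assumes the plain-text KO list has one gene per line, with the gene identifier followed by a K number when one was assigned and the identifier alone otherwise. The function name parse_kaas and the example entries are hypothetical placeholders.

```python
def parse_kaas(text):
    """Map gene identifiers to K numbers, skipping genes without an assignment."""
    ko_by_gene = {}
    for line in text.splitlines():
        fields = line.split()
        # Annotated genes have a second field starting with "K".
        if len(fields) >= 2 and fields[1].startswith("K"):
            ko_by_gene[fields[0]] = fields[1]
    return ko_by_gene


# Hypothetical KAAS output: protB received no KO assignment.
kaas_output = "protA\tK00001\nprotB\nprotC\tK00002\n"
assignments = parse_kaas(kaas_output)
print(len(assignments), "of 3 proteins have a KO assignment")
```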

2.6 Subcellular Localization

We analysed the metabolic set of proteins in CELLO v2.5 for subcellular localization and biological significance. Subcellular localization classifies the potential proteins as cytoplasmic, periplasmic, outer membrane, extracellular or inner membrane proteins [16].
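Once each protein has a predicted localization, the per-compartment counts can be tallied as in this minimal sketch. The mapping of protein identifiers to labels is hypothetical and would in practice be transcribed from the CELLO results.

```python
from collections import Counter


def tally_localization(predictions):
    """Count proteins per predicted subcellular localization."""
    return Counter(predictions.values())


# Hypothetical predictions; the labels follow the five classes named above.
predictions = {
    "protA": "cytoplasmic",
    "protB": "inner membrane",
    "protC": "cytoplasmic",
    "protD": "extracellular",
}
counts = tally_localization(predictions)
for label, count in counts.most_common():
    print(f"{label:<16}{count:>4}")
```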

2.7 Identification of Druggability and Virulence Factors

The proteins obtained after non-homology analysis against gut flora were further subjected to druggability and virulence analysis. Virulence factors provide new insight into the development of potential antifungal drugs [17].

2.8 Selection of Anti Fungal Drug Targets

Finally, the proteins remaining after subtractive analysis are putative antifungal drug targets. These targets should be experimentally validated before lead discovery and are candidates for the screening of novel antifungal drugs.

3 Results and Discussion

Owing to the rapid evolution of bacterial proteins, resistance to existing antibiotics is spreading and has become a global health hazard. This calls for the design of antibacterial candidates aimed at novel drug targets. Integrated databases, automated gene sequencing, algorithms and tools have made it easier to compare genomes and gene products, which in turn favours genome-based drug target design. Moreover, an automated, quick and effective methodology for identifying drug targets for a specific pathogen from the whole proteome of bacteria or fungi offers an effective way to deal with the growing drug resistance of bacterial and fungal pathogens [14].

The present work is based on an advanced comparative and subtractive genomic approach. Unique and vital proteins are critical for the growth, pathogenicity and survival of A. fumigatus [6]. A compelling way to tackle aspergillosis is to find target proteins and their essential metabolic pathways. The results of the systematic procedure for mining putative drug targets are presented in Fig. 2.

Fig. 2. Results of putative drug mining

3.1 Identification of Paralogous Proteins

The whole-genome protein sequence file (*.faa) of A. fumigatus, the causative agent of the infection, was retrieved from NCBI and screened with the TiDv2 tool. In the present study, 56757 proteins were retrieved from NCBI. These proteins were screened in the CD-HIT suite at a 60% identity threshold, which yielded 10815 non-paralogous proteins.

3.2 Identification of Essential Proteins

The non-paralogous proteins were run in the Essentiality Analysis tab of TiD. These 10815 proteins were searched with BLASTP against the proteins common to DEG and CEG at an e-value <= 10^-10 and a bit score >= 100. In total, 1860 proteins were found to be essential for the growth of A. fumigatus.

3.3 Selection of Human and Gut Flora Non-homologs

The 1860 essential proteins were screened in the Non-Homology Analysis tab of TiD. Human non-homologous proteins were detected by BLASTP as those with an e-value > 0.005, whereas proteins with an e-value < 0.005 showed substantial similarity between pathogen and human host. Proteins identified as homologous were excluded so that the target candidates are safe for the human host and cause fewer side effects. We found 453 pathogen proteins that are non-homologous to the human host. Of these 453 proteins, 428 were also non-homologous to gut flora at an e-value > 10^-4.
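The attrition across the subtractive steps can be summarized with a short calculation. The counts below are the ones reported in Sections 3.1 to 3.3; the stage labels and the helper retention are introduced here only for illustration.

```python
# Protein counts reported at each stage of the subtractive pipeline.
stages = [
    ("NCBI proteome", 56757),
    ("non-paralogous", 10815),
    ("essential", 1860),
    ("human non-homologous", 453),
    ("gut flora non-homologous", 428),
]


def retention(stages):
    """Return (stage name, count, fraction kept from the previous stage)."""
    rows = []
    for (_, before), (name, after) in zip(stages, stages[1:]):
        rows.append((name, after, after / before))
    return rows


for name, count, fraction in retention(stages):
    print(f"{name:<26}{count:>7}{fraction:>8.1%}")
```

Each row shows how many proteins survive a filter relative to the preceding stage, which makes it easy to see which filter is the most selective.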

3.4 Metabolic Pathway Analysis

Mapping the 428 gut flora non-homologous proteins to the UniProt identifiers of A. fumigatus yielded 395 UniProt identifiers. To study the role of these genes in metabolic pathways, the mapped gene set of potential drug targets was characterized on the KAAS server (KEGG Automatic Annotation Server). These targets were further searched with BLAST against the KEGG database to obtain their metabolic pathways. The output gene set contains KO (KEGG Orthology) assignments, which determine the metabolic pathways. There were 13817 KO assignments in total in the whole proteome.

3.5 Subcellular Localization

Subcellular localization of the probable drug targets revealed that 137 were cytoplasmic, 132 inner membrane, 81 periplasmic, 51 outer membrane and 26 extracellular proteins (Table 1). The location of these putative drug targets will be important when designing a drug or vaccine. A key requirement in rational drug design is that the subcellular localization of a drug target matches the pharmaceutical properties of the lead molecules directed at it.

Table 1. Subcellular localization of putative drug targets

3.6 Identification of Druggability and Virulence Factors

Screening the 428 gut flora non-homologous proteins for druggability identified 304 proteins as novel drug targets. Of the 428 proteins, 74 were found to be virulent. Virulence factor analysis is essential for identifying drug targets of the pathogen [17]. Drug targets that are virulence factors are vital for the onset of infection and persistence in the host. Hence, experimental lead molecules developed against these potential drug targets are integral to formulating a novel therapeutic strategy against pathogens.

3.7 Selection of Anti Fungal Drug Targets

Using subtractive genomic analysis, we detected 5 proteins as novel drug targets that are virulence factors and belong to unique metabolic pathways. Of these 5 novel drug targets, 1 was an outer membrane protein and 4 were cytoplasmic proteins. The UniProt identifier and gene name of each protein are shown in Table 2.
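One way to express this final selection is as a set intersection. The sketch below assumes that a protein is shortlisted only if it is druggable, virulent and mapped to a unique metabolic pathway. That combination rule is an assumption made for illustration, and the identifiers are hypothetical.

```python
def shortlist(druggable, virulent, unique_pathway):
    """Return proteins that satisfy all three criteria, in sorted order."""
    return sorted(set(druggable) & set(virulent) & set(unique_pathway))


# Hypothetical protein identifiers for each criterion.
druggable = ["protA", "protB", "protC", "protD"]
virulent = ["protB", "protC", "protE"]
unique_pathway = ["protC", "protB", "protF"]

final_targets = shortlist(druggable, virulent, unique_pathway)
print(final_targets)
```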

Table 2. Detailed information of putative drug targets

This subtractive/comparative genome analysis has been applied effectively to various bacterial and fungal genomes, such as Mycoplasma pneumoniae M129 [18], Staphylococcus aureus N315 [8], Mycobacterium tuberculosis F11 [19] and Neisseria meningitidis serogroup B [13], for the screening of drug targets. The proteins screened in the present study should help future researchers develop potential therapeutic drugs.

4 Conclusion

In this era of bioinformatics, traditional methods of drug discovery and design are becoming obsolete. The vast volume of biological databases is reshaping drug discovery and design. To reduce the threat of aspergillosis, novel drug targets need to be identified. In this study, we performed subtractive and comparative genomics analysis on the pathogenic proteins of A. fumigatus and identified 5 novel potential drug targets (His6, FasA, PabaA, FtmA and erg6) for drug design and vaccine development. These putative drug targets play an important role in vital metabolic pathways. However, the outcomes of the present work need to be validated by further clinical research to confirm their role in restraining the growth and disrupting the virulence of A. fumigatus. This study should help in designing new anti-A. fumigatus drugs against aspergillosis, which is particularly relevant given recent findings on the growing risk of resistance in A. fumigatus.