Introduction

Malaria, a life-threatening disease, is caused by five species of the genus Plasmodium: P. falciparum, P. malariae, P. ovale, P. knowlesi and P. vivax, of which P. falciparum and P. vivax are the most dangerous [1]. According to the December 2013 WHO Malaria report, about 207 million new cases of malaria were reported in 2012, with an estimated 627,000 deaths. Since 2000, the malaria death rate has fallen by 42 % globally and by 49 % in the African region. In the Southeast Asia region, about 1.2 billion people are exposed to malaria, most of whom are from India. It is estimated that 2–3 million new cases arise each year. According to the National Vector Borne Disease Control Program (NVBDCP), the annual parasite incidence (API) was \(>\)5 in Rajasthan, Gujarat, Goa, Karnataka, Southern Madhya Pradesh, Chhattisgarh and Orissa [2].

Malaria is transferred from one patient to another by the female mosquito of the genus Anopheles. Of the 460 Anopheles species identified, only 30–40 can transmit parasites of the genus Plasmodium. Different Anopheles species transmit the parasite in different geographical locations, A. gambiae being the most common vector in Africa and A. culicifacies in India [3].

Female Anopheles mosquitoes transfer the parasite from human to mosquito and back to human by feeding on the blood of a Plasmodium-infected person. In a human host, the parasite undergoes asexual multiplication, replicating first in the liver and subsequently in erythrocytes. Gametocytes are then released, which can either be destroyed by the host immune system or be transferred into the midgut of a female Anopheles mosquito when it feeds on an infected human. In the mosquito midgut, sexual replication takes place to form mature sporozoites, which then travel into the salivary glands and are reinjected into a human host at the next feeding. For most of its life cycle, the parasite resides within human liver and blood cells, where it evades direct access by immune surveillance and remains invisible. However, the spleen recycles circulating red blood cells (RBCs); therefore, as a protective mechanism, P. falciparum induces specific adhesive proteins on the surface of infected RBCs. These adhesive proteins allow infected RBCs to stick to the walls of blood vessels, which prevents their passage through the spleen and prolongs the circulatory half-life of infected RBCs [4]. Multiple gene expression analyses carried out in the past have clarified the changes in the transcriptome of P. falciparum over its developmental cycle. Gene expression data were obtained from the supplementary materials provided by Le Roch et al. [5] and Foth et al. [6], which portray genome-scale transcriptome analyses of P. falciparum, including nine stages during development.

We analysed in vitro gene expression data and followed this with a flux balance analysis. Flux balance analysis (FBA) is widely used to study genome-scale metabolic network reconstructions and is based on mass conservation. By calculating the flow of metabolites through the metabolic network, FBA predicts the growth rate of the organism as a whole or the rate of utilization of any specific metabolite. FBA requires information about the stoichiometry of the reaction pathways [7]. Network-based analyses of P. falciparum metabolism have been carried out by Yeh et al. [8], Fatumo et al. [9], and Huthmacher et al. [10], who found drug targets based on choke point reaction studies and in silico knockout studies. However, Yeh et al. [8] and Fatumo et al. [9] did not consider the life cycle stages in their studies. Including life cycle stages is important because the availability of nutrients differs between environments. This was taken into account in the study carried out by Huthmacher et al. [10].

To date, no vaccine is available against the parasite because it can evade the host immune system [11]. The situation is further complicated because the parasite acquires resistance to existing drugs by altering the sequence of the drug target without compromising its function [12]. The development of multi-drug resistance in parasites is a problem, and its spread could be disastrous [13]. Therefore, drugs designed to affect multiple targets simultaneously would be effective, as it will be difficult for the parasite to modify multiple targets without lethal consequences.

In silico drug design and computer-aided drug design (CADD) are knowledge-based approaches. The goal of our study is to explore the erythrocytic stage-specific metabolism of P. falciparum to identify crucial targets involved in the parasite development process before it acquires the ability to evade the immune system. We examined the available omics resources of stage-specific expression, biomass equations and their flux distribution with respect to gene expression for target identification. A protein similar to the predicted P. falciparum target was also found to be present in humans. Therefore, candidate molecules were docked on both proteins, and the ratio of docking scores was used to screen for new lead compounds that bind more selectively to the P. falciparum protein.

Materials and methods

Gene expression analysis

Gene expression data were obtained from the supplementary materials provided by Le Roch et al. [5]. This genome-scale transcriptome analysis of P. falciparum includes nine stages of parasite gene expression during development. Quantitative time course profiling of P. falciparum transcripts and proteins throughout the 48-h intraerythrocytic development cycle was collected from Gene Expression Omnibus (GEO) Accession No. GSE24416 [14].

The data were normalized and transformed using the R Bioconductor statistical package limma [15]. Differential expression analysis was carried out in CLC Main Workbench and ANOVA was calculated [16]. Genes were retained if they had a p value \(<\)0.05 and a fold change difference \(>\)1. The significantly differentially expressed genes were plotted on the P. falciparum pathway using the Omics viewer [17].
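The filtering criterion can be sketched in a few lines of Python. This is only an illustration, not the CLC Main Workbench procedure: the gene identifiers and values are hypothetical, and the fold change difference is assumed to be compared by its absolute value so that down-regulated genes are also kept.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene_id: str       # hypothetical identifier, for illustration only
    p_value: float     # ANOVA p value for the gene
    difference: float  # fold change (difference) between conditions


def filter_significant(results, p_cutoff=0.05, diff_cutoff=1.0):
    """Keep genes with p < p_cutoff and |difference| > diff_cutoff."""
    return [
        r for r in results
        if r.p_value < p_cutoff and abs(r.difference) > diff_cutoff
    ]


# Hypothetical example values, not taken from the study data.
example = [
    GeneResult("gene_A", 0.001, 2.4),
    GeneResult("gene_B", 0.20, 3.1),   # fails the p value cut-off
    GeneResult("gene_C", 0.01, -1.8),  # down-regulated, passes both cut-offs
    GeneResult("gene_D", 0.03, 0.4),   # fails the fold change cut-off
]
kept = [r.gene_id for r in filter_significant(example)]
```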

Flux balance analysis

The first step in FBA was to reconstruct the metabolic network and determine the stoichiometry of its reactions, using available well-curated metabolic models and enzyme databases. A careful literature search was performed to identify all metabolic reactions, the compartmentalization of enzymes, and transport reactions for additional metabolites and secretions [7, 18].

The next step was to convert the reconstruction into a mathematical model in the form of a numerical matrix (S) of the stoichiometric coefficients of each reaction, in which the m metabolites are represented in rows and the n reactions in columns. A negative or positive sign on a coefficient indicates that the metabolite is consumed or produced in the reaction, respectively. A zero entry indicates that the metabolite does not participate in the reaction. For a matrix S, all material balances are given by the following equation:

$$\begin{aligned} \frac{\mathrm{d}X}{\mathrm{d} T}=S.v, \end{aligned}$$
(1)

where \(v\) is the vector of fluxes through the system and X is the m-dimensional vector of metabolite amounts per cell. At steady state, the flux through each reaction satisfies the following equation:

$$\begin{aligned} S.v=0 \end{aligned}$$
(2)

Further constraints are imposed on the fluxes:

$$\begin{aligned} \alpha < v <\beta \end{aligned}$$

where \(\alpha \) is the lower bound and \(\beta \) is the upper bound on v.

These balances and bounds define the allowable flux distributions of the system.
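A minimal Python sketch of this feasibility check is given here for illustration. The three-reaction network is hypothetical and is not part of the P. falciparum reconstruction; the bounds are treated as inclusive with a small numerical tolerance, which is an assumption of the sketch.

```python
def steady_state_residual(S, v):
    """Return S.v, one entry per metabolite (row of S)."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_allowable(S, v, lower, upper, tol=1e-9):
    """True if v satisfies S.v = 0 and lower <= v <= upper (within tol)."""
    balanced = all(abs(r) <= tol for r in steady_state_residual(S, v))
    bounded = all(lo - tol <= x <= hi + tol
                  for lo, x, hi in zip(lower, v, upper))
    return balanced and bounded


# Hypothetical toy network: R1: -> A, R2: A -> B, R3: B ->
# Rows are metabolites (A, B); columns are reactions (R1, R2, R3).
S = [
    [1, -1, 0],  # A is produced by R1 and consumed by R2
    [0, 1, -1],  # B is produced by R2 and consumed by R3
]
lower = [0.0, 0.0, 0.0]
upper = [10.0, 10.0, 10.0]

ok = is_allowable(S, [5.0, 5.0, 5.0], lower, upper)          # balanced, in bounds
unbalanced = is_allowable(S, [5.0, 3.0, 3.0], lower, upper)  # A accumulates
```

Any flux vector that passes both tests lies in the allowable space; the objective function described next selects one point from that space.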

The third step in an FBA was to define a biological objective, i.e., the problem being studied, for example, growth or ATP production. Mathematically, the objective is represented by an objective function that specifies the contribution of each reaction to the objective. The mathematical representation (S) and the objective were solved using linear programming [7].
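As a worked illustration of these three steps, consider a small hypothetical network (not part of the P. falciparum reconstruction) with two metabolites, A and B, and four reactions: uptake of A (\(v_{1}\)), conversion of A to B (\(v_{2}\)), secretion of A (\(v_{3}\)) and drain of B into biomass (\(v_{4}\)). The bounds, taken here as inclusive, and the objective vector c are assumed values chosen only for this example.

$$\begin{aligned} S=\begin{pmatrix} 1 & -1 & -1 & 0\\ 0 & 1 & 0 & -1 \end{pmatrix},\qquad \max _{v}\; Z=c^{T}v=v_{4}\quad \mathrm{subject\ to}\quad S.v=0,\quad 0\le v_{1}\le 10,\quad 0\le v_{2}\le 6,\quad 0\le v_{3}\le 10,\quad 0\le v_{4}\le 10 \end{aligned}$$

The steady-state condition gives \(v_{1}=v_{2}+v_{3}\) and \(v_{2}=v_{4}\), so \(Z=v_{4}=v_{2}\le 6\). The optimum \(Z=6\) is reached with \(v_{2}=v_{4}=6\) and any \(v_{3}\) between 0 and 4 (so that \(v_{1}=6+v_{3}\le 10\)). The optimal objective value is therefore unique, whereas the flux distribution that achieves it need not be.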

Pathway Tools version 17.5 was used for FBA. A free academic license was received from SRI International and Biocyc.org [17].

Metabolic reconstruction was derived from the Biocyc database [19] and annotated with databases such as the Malaria Parasite Metabolic Pathway (MPMP) [20], KEGG [21] and PlasmoDB [22]. Metabolic pathway information was first curated with KEGG to link enzyme and gene identifiers, which clarified the metabolic network of P. falciparum, and the data were finally curated with Biocyc. The resulting information was used to carry out an FBA in Pathway Tools [17].

Pharmacophore generation

A pharmacophore profile was developed using the freely available web server Pharmagist [23]. Pharmagist can generate a pharmacophore from a set of structures of drug-like molecules that are known to bind to the receptor, without the actual target receptor structure [23]. The pharmacophore was submitted to ZINCPharmer [24], and filtering was performed based on molecular weight \(\le \)500 and number of rotatable bonds \(\le \)10, with the maximum hits per conformation and per molecule set to one. Compounds that matched these criteria were again filtered using Lipinski’s rule of 5 in Knime [25].
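The two filtering stages can be sketched as follows. This is not the ZINCPharmer or Knime workflow itself: the compounds and their property values are hypothetical and assumed to be precomputed, and the sketch assumes that no Lipinski violation is tolerated (some implementations allow one).

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str             # hypothetical label
    mol_weight: float     # molecular weight in Da
    log_p: float          # octanol-water partition coefficient
    h_donors: int         # hydrogen bond donors
    h_acceptors: int      # hydrogen bond acceptors
    rotatable_bonds: int


def passes_prefilter(c, max_weight=500.0, max_rotatable=10):
    """Pre-filter: molecular weight <= 500 and rotatable bonds <= 10."""
    return c.mol_weight <= max_weight and c.rotatable_bonds <= max_rotatable


def lipinski_violations(c):
    """Count violated Lipinski criteria (weight, logP, donors, acceptors)."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def passes_lipinski(c, max_violations=0):
    """Assumed strict setting: no violation allowed unless stated otherwise."""
    return lipinski_violations(c) <= max_violations


# Hypothetical compounds with made-up property values.
library = [
    Compound("cmpd_1", 182.2, -3.1, 3, 4, 5),
    Compound("cmpd_2", 480.0, 6.2, 2, 6, 8),    # logP too high
    Compound("cmpd_3", 350.4, 1.0, 1, 5, 14),   # too many rotatable bonds
]
selected = [c.name for c in library
            if passes_prefilter(c) and passes_lipinski(c)]
```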

Compounds collected from different databases, along with the known inhibitors, were docked (rigid docking) on the target protein from both P. falciparum and Homo sapiens. The ratio of docking scores between the two organisms was calculated, and the lowest ratio among the known inhibitors was used as a threshold to filter the compounds.

Substrates, products and intermediate metabolites were removed from the compound list. The remaining compounds were docked (flexible docking), keeping 5 conformations for each ligand, in CLC Drug Discovery Workbench, and the ratio of the pfaODC docking score to the hODC docking score was calculated. Compounds having a docking score ratio greater than that of the known inhibitors with known experimental \(\mathrm{IC}_{50}\) values were selected for binding analysis.

ADMET prediction was carried out using the online tool admetSAR (http://www.admetexp.org), an ADMET structure–activity relationship database with data collected and curated from the literature [26].

Results and discussion

Gene expression analysis

Variations in the probe intensity values between arrays of the dataset were compared using box plots [27]. Figure 1a and b show box plots of the raw and normalized data, respectively, and Fig. 2 shows a box plot of the time course data GSE24416 analysed using GEO2R in GEO [14]. By default, these graphs are constructed after \(\log _{2}\) transformation. The upper and lower quartiles are represented by the two ends of the box, whereas the median is represented by a line in the centre of the box. Values lying more than 1.5 times the interquartile range beyond the first or third quartile, i.e., beyond the edges of the box, were termed outliers and are represented by circles. The horizontal lines on either side of the box mark the largest and smallest values that are not outliers. The box plots show that each array had a different probe intensity distribution; hence, data normalization was performed.
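The quantities drawn in such a box plot can be computed as in the sketch below. The intensities are hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption; plotting packages may use a slightly different convention.

```python
import math
import statistics


def boxplot_summary(intensities):
    """Quartiles and 1.5 x IQR outliers of log2-transformed intensities."""
    logged = sorted(math.log2(x) for x in intensities)
    q1, median, q3 = statistics.quantiles(logged, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in logged if x < low_fence or x > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Hypothetical probe intensities: powers of two, with one extreme value.
intensities = [2, 4, 8, 16, 32, 64, 128, 256, 512, 2 ** 30]
summary = boxplot_summary(intensities)
```

For these values the log2 intensities are 1 to 9 plus a single value of 30, which lies beyond the upper fence and is reported as the only outlier.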

Fig. 1 Box plots of P. falciparum 3D7 microarray data: (a) raw data, (b) after transformation and normalization

Fig. 2 Box plot of time course data from GEO (Accession no. GSE24416)

By applying one-way analysis of variance (ANOVA) to the time course data, we found that 3021 of 5100 genes were differentially expressed, with a p value \(< 0.05\) and fold change (difference) \(>\)1. These data were plotted as a volcano plot (Fig. 3), with the log of the fold change difference on the X axis and the negative log (base 10) of the p value on the Y axis. Plotting points in this way highlights two regions of interest: the points towards the top of the plot at the far left and the far right. These represent transcripts with both a large fold change and high statistical significance.
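The coordinates of a point in the volcano plot can be sketched as follows. The base of the logarithm on the X axis is not stated in the text, so base 2 is assumed here; the fold change is assumed to be a ratio, and the cut-off of 1 on the log scale is likewise an illustrative assumption.

```python
import math


def volcano_point(fold_change, p_value):
    """Return (x, y) = (log2 fold change, -log10 p value)."""
    return math.log2(fold_change), -math.log10(p_value)


def in_region_of_interest(x, y, min_abs_log_fc=1.0, p_cutoff=0.05):
    """True for points far left or right (large |x|) and near the top (small p)."""
    return abs(x) > min_abs_log_fc and y > -math.log10(p_cutoff)


# Hypothetical transcripts: (fold change ratio, p value).
up = volcano_point(4.0, 0.001)         # strongly up-regulated, significant
down = volcano_point(0.25, 0.001)      # strongly down-regulated, significant
unchanged = volcano_point(1.1, 0.5)    # small change, not significant

flags = [in_region_of_interest(*pt) for pt in (up, down, unchanged)]
```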

Fig. 3 Volcano plot showing statistically significant genes with p value \(<\)0.05 and a fold change (difference) \(>\)1

Flux balance analysis and reaction knockout studies

FBA in development mode in Pathway Tools [17] stratified metabolites into two categories. First, metabolites with a high flux in biomass that are also present in the nutrient section suggest that a transporter may be present for them, and these were excluded from the analysis. Second, metabolites with a high flux and no corresponding presence in the nutrient section suggest that they are synthesized by the parasite. The second category includes metabolites such as thiamine diphosphate, glutathione, purines and pyrimidines, and putrescine, which belong to groups such as nucleoside and nucleotide biosynthesis, electron carrier biosynthesis and polyamine biosynthesis, with putrescine showing flux values ten times higher than the others.

Enzymes involved in the synthesis of the metabolites in the second category were selected as possible targets from our flux balance analysis. Thiamine diphosphate is synthesized from thiamine phosphate by the enzyme thiamine diphosphokinase; however, this enzyme was not differentially expressed and was hence excluded. Moreover, electron carriers such as thioredoxin reductase and glutathione reductase, which are essential for survival against oxidative stress, were also excluded because the amino acids of these targets that contribute to drug binding are situated on a flexible side chain [28], which makes it difficult to design a drug with high specificity.

The next target group was purine and pyrimidine synthesis. Because Plasmodium cannot synthesize purines, they are acquired through transporters, whereas pyrimidines are synthesized de novo and salvage has not been observed. Current antimalarials, such as pyrimethamine and sulfadoxine, are targeted against pyrimidine synthesis and folate metabolism, disrupting the constant supply of tetrahydrofolate cofactors required for one-carbon transfer reactions and DNA replication [29]. Moreover, inhibitors such as 1-hydroxy-2-dodecyl-4(1H) quinolone (HDQ) are used to inhibit the mitochondrial enzyme dihydroorotate dehydrogenase, which is required for de novo pyrimidine synthesis in P. falciparum [30].

Polyamines, such as putrescine, spermidine, and spermine, are compounds that contain two or more amino groups. At alkaline pH, they interact electrostatically with anionic macromolecules, stabilizing DNA, RNA and nucleoside triphosphates, proteins and phospholipids. Interactions with polyamines can alter DNA conformation, replication and transcription regulation, membrane integrity, ion channel regulation and protection against oxidative stress. The transporter for polyamines is yet to be identified in the Plasmodium genome [31].

Putrescine, a polyamine required for cell growth, is a substrate for spermidine synthesis. S-adenosyl-l-methionine decarboxylase–ornithine decarboxylase (AdometDC/ODC) is a bifunctional enzyme catalysing the conversion of both l-ornithine to putrescine and S-adenosyl-l-methionine to S-adenosyl-l-methioninamine. This enzyme also catalyses the conversion of arginine to agmatine, another source of putrescine, which is why this reaction is not a choke point reaction.

Moreover, the expression of a stage-specific target should follow a sigmoidal curve, i.e., it should be up-regulated in one stage and down-regulated in the successive stage. Inhibiting such a target may lead to stage arrest. This pattern is observed for pfaODC in the profile graph and heat map from strains Dd2 and 3D7, respectively (Fig. 4a, b). Hence, we hypothesized that polyamines need to be synthesized by the parasite during the transition from the ring to the trophozoite stage and not in the schizont stage. This correlates strongly with in vitro studies in which inhibition of ODC in the malaria parasite by difluoromethyl ornithine (DFMO) causes schizont arrest [32–34].

Fig. 4 (a) Profile graph of pfaODC in strain Dd2 from quantitative time course data; (b) heat map representing the expression pattern of pfaODC in strain 3D7 (shaded)

It was observed that removing the reaction ARGDECARBOX-RXN blocks the production of agmatine. The reaction catalysing the conversion of l-ornithine to putrescine did not carry any flux; therefore, knockout studies could not be carried out for it (data not shown). The protozoa that cause Chagas disease (Trypanosoma cruzi), African sleeping sickness (subspecies of Trypanosoma brucei), leishmaniasis (Leishmania species) and malaria (Plasmodium species) are highly dependent on polyamines for growth [35].
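The effect of removing a reaction can be illustrated with a simplified reachability check. This is a swapped-in technique for illustration, not the FBA knockout performed in Pathway Tools: it ignores stoichiometry and flux and only asks which metabolites can still be produced from the nutrients. ARGDECARBOX-RXN is named in the text; the other reaction names and the toy network layout are hypothetical.

```python
def producible(reactions, seeds, knocked_out=()):
    """Metabolites reachable from the seeds once knocked-out reactions are removed."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name in knocked_out:
                continue
            # A reaction can fire once all of its substrates are available.
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available


# Toy network: reaction name -> (substrates, products).
reactions = {
    "ARGDECARBOX-RXN": (["arginine"], ["agmatine"]),
    "AGMATINE-TO-PUTRESCINE": (["agmatine"], ["putrescine"]),  # hypothetical name
    "SPERMIDINE-SYNTHESIS": (["putrescine"], ["spermidine"]),  # hypothetical name
}
seeds = ["arginine"]

wild_type = producible(reactions, seeds)
knockout = producible(reactions, seeds, knocked_out={"ARGDECARBOX-RXN"})
blocked = sorted(wild_type - knockout)
```

In this toy network, agmatine and everything downstream of it become unreachable once ARGDECARBOX-RXN is removed.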

Structural comparison of ornithine decarboxylase

To find the unique insertions in the pfaODC sequence, its amino acid sequence was compared with the ODC sequences of L. donovani, M. musculus, H. sapiens and T. brucei using multiple pair-wise alignments. Two unique insertion areas become apparent in the pfaODC sequence: one between residues 1047–1058 and 1139–1296, and another at the hinge region between residues 573–837. The largest insertion of 158 residues and the hinge region were removed for a satisfactory outcome. Pair-wise alignment showed that the pfaODC amino acid sequence has the highest identity with the T. brucei enzyme, with a score of 30.49, followed by H. sapiens with a score of 29.2683. Figure 5a and b show the rooted and unrooted phylogenetic trees, respectively, of the alignment. The tree shows a close evolutionary relationship between T. brucei and P. falciparum ODC, whereas the enzyme is distantly related to the human and mouse enzymes.

Fig. 5 (a, b) Phylogeny of ornithine decarboxylase from various sources. (c) Structure of pfaODC modelled on T. brucei ODC (PDB ID: 1M9V) as template, with PLP-DMO as ligand. (d) Structure of hODC (PDB ID: 2OO0) with APA as ligand

The homology-modelled structure of pfaODC was created by Birkholtz et al. [30] using the T. brucei crystal structure (PDB ID: 1M9V). To target the unique sequence of pfaODC, 38 malaria-specific amino acid residues were modelled ab initio and the rest of the amino acids were discarded. Active site residues in the query sequence were identified based on proven functional residues present in the active site pocket of the human and Trypanosoma crystal structures. The corresponding residues in the crystal structure of Trypanosoma ODC (PDB ID: 1M9V) include Lys69, Arg154, His197, Gly235-237, Glu274, Arg277, Tyr389, Asp332, Cys360 and Asp361 [32].

Figure 5c shows the structure of pfaODC (modelled using 1M9V as template) bound with PLP-DMO (pyridoxal phosphate-difluoromethylornithine) as ligand, and Fig. 5d shows the crystal structure of hODC (PDB ID: 2OO0) with APA as bound ligand. In the mammalian system, ODC is tightly regulated via transcriptional, translational and post-translational mechanisms. It has a short half-life of 10–20 min, which is further reduced to \(<\)5 min upon complex formation with the ornithine decarboxylase antizyme (ODC-AZ). ODC-AZ is a polyamine-regulated, inducible enzyme inhibitor that binds to ODC monomers and targets them for 26S proteasome-mediated degradation [36]. Moreover, the level of polyamines itself is regulated by interconversion of individual pools and by a transport system that allows import and export of polyamines and their intermediates [37].

Pharmacophore analysis

A pharmacophore is a spatial arrangement of features that are important for a molecule to interact with a target receptor. A pharmacophore was designed based on known inhibitor analogues of ornithine decarboxylase (DFMO, difluoromethyl putrescine, ornithine, APA, 1,4-diaminobutanone, 3-aminooxy-2-fluoropropanamine and 2-fluoromethyl ornithine). Out of 7 compounds submitted for pharmacophore design, only 4 molecules (DFMO, ornithine, APA and 3-aminooxy-2-fluoropropanamine) were aligned resulting in 1 output pharmacophore with a score of 24.80. Figure 6 shows the pharmacophore obtained from the Pharmagist web server and the features shared by the aligned molecules.

Docking studies

A total of 53,621 compounds were obtained from the ZINC database [38] using the pharmacophore generated in the ZINCPharmer search, with cut-offs applied to molecular weight and number of rotatable bonds. The results were downloaded in SDF format for further analysis. These were again filtered based on Lipinski’s rule of 5 using Knime, which resulted in 52,687 compounds that were further processed.

Further, the PubChem database [39] was searched for analogues of ornithine, putrescine, DFMO, DFMP, APA and 1,4-diaminobutanone. A total of 158 analogues were obtained from the PubChem database, which satisfy Lipinski’s rule of 5.

These compounds, along with the known inhibitors, were docked (rigid docking) on pfaODC and hODC (PDB ID: 1M9V and 2OO0, respectively) on the inhibitor binding site with a grid radius of 10 Å. Knowledge of the preferred orientation was used to anticipate the binding affinity using the scoring function.

ODC is present in humans as well as in P. falciparum. hODC is short-lived and recycled rapidly by ODC-AZ. Therefore, ODC expression is always turned on in humans. The \(\mathrm{IC}_{50}\) value against P. falciparum would be higher if a compound with a higher affinity toward hODC were used to inhibit pfaODC in vivo. Such an increase in the \(\mathrm{IC}_{50}\) value could be attributed both to non-target binding and to the rapid replenishment of inhibited hODC. Therefore, a pfaODC inhibitor must have a higher affinity toward pfaODC than toward hODC. This was assessed by taking the ratio of the docking score of a compound on pfaODC to that on hODC. The lowest ratio among the known inhibitors was used as the cut-off for screening potential candidates.

The lowest ratio of 0.83 was observed for the known inhibitor CGP52622A. Based on this cut-off value, only 2 of the 52,687 compounds from the ZINC database were selected. Moreover, of the 158 compounds initially screened from PubChem based on a pharmacophore, a total of 147 compounds had a docking ratio greater than that of the known inhibitor.
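The ratio-based screen can be sketched as follows. All compound names and docking scores are hypothetical; the sketch assumes that both scores have the same sign and that a more negative score indicates stronger binding, so that a larger pfaODC/hODC ratio indicates a preference for pfaODC. How the docking software actually scales its scores is not specified here.

```python
def docking_ratio(pfa_score, human_score):
    """Ratio of the docking score on pfaODC to that on hODC."""
    return pfa_score / human_score


def screen(candidates, known_inhibitors):
    """Keep candidates whose ratio exceeds the lowest ratio of the known inhibitors."""
    cutoff = min(docking_ratio(p, h) for p, h in known_inhibitors.values())
    selected = [name for name, (p, h) in candidates.items()
                if docking_ratio(p, h) > cutoff]
    return cutoff, selected


# Hypothetical docking scores: name -> (score on pfaODC, score on hODC).
known_inhibitors = {
    "known_1": (-50.0, -60.0),  # ratio ~0.83, sets the cut-off
    "known_2": (-45.0, -40.0),  # ratio 1.125
}
candidates = {
    "cand_A": (-66.0, -60.0),   # ratio 1.1, kept
    "cand_B": (-30.0, -60.0),   # ratio 0.5, rejected
}
cutoff, selected = screen(candidates, known_inhibitors)
```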

Fig. 6 Comparative docking score of top compounds on pfaODC and hODC

The substrate, product, intermediate metabolites and duplicate entries were removed, leaving 47 compounds, including 10 known inhibitors. \(\mathrm{IC}_{50}\) values for four of these compounds were reported against pfa and human [40], and the \(\mathrm{IC}_{50}\) value of only one compound was reported against Leishmania amazonensis [41]. These 47 compounds were docked (flexible docking), keeping the top five conformations for each ligand, in CLC Drug Discovery Workbench [16], and the docking score ratio of pfa over human was calculated. Compounds with a ratio greater than those of the known inhibitors were selected. The octanol-water partition coefficient was also considered, and compounds with a value \(\ge 0\) were removed. As a result, 43 compounds were obtained with a docking ratio \(>0.83\) (the ratio of CGP52622A) and XLogP values \(<0\). Figures 6 and 7 depict the docking scores of the top 15 molecules on pfaODC and hODC and their docking score ratios, respectively. Table 1 lists the docking scores and ratios of the known inhibitors along with their \(\mathrm{IC}_{50}\) values. Table 2 presents the docking scores and ratios for potential inhibitors.

Fig. 7 Ratio of docking score between pfaODC and hODC

Table 1 Docking score and ratio (pfa/human) calculated for known inhibitors
Table 2 Docking score and ratio calculated for probable inhibitors

Binding analysis

Amino acids involved in binding with known inhibitors were identified and compared with the binding of potential inhibitors (Fig. 8a, b). Table 3 shows amino acids important for binding with known and potential inhibitors. From this Table, it can be seen that potential inhibitors also bind with the same amino acids where the known inhibitors bind, except for Asp361 A OD1 (interaction with APA only). However, potential inhibitors could have additional interactions with Asp736 B OD2, Gly362 A N and Tyr540 B OH which suggests a stronger target affinity than that of known inhibitors.

Fig. 8 Binding analysis of known inhibitors (a) and probable inhibitors (b) on pfaODC

Table 3 Amino acids involved in binding of known and probable inhibitors

ADMET prediction

ADME/T is an abbreviation for absorption, distribution, metabolism, excretion and toxicity in pharmacokinetic studies. These criteria affect the exposure of tissue to a drug and the drug profile of a potential therapeutic agent. Predicted ADMET properties of known and potential enzyme inhibitors are displayed in Table 4. For metabolism, the prediction indicates whether the compound is a substrate or inhibitor of Cytochrome P450, a membrane-associated protein located in the inner mitochondrial membrane or in the endoplasmic reticulum of cells, which metabolizes thousands of endogenous and exogenous chemicals. Carcinogenicity for the rat was extracted from the carcinogenic potency database (CPDB) [42]. According to \(\mathrm{TD}_{50}\) values, a compound can be classified into 3 classes: Danger, Warning and Not Required. Carcinogenic compounds with \(\mathrm{TD}_{50} \le \)10 and \(>\)10 mg/kg body weight/day were assigned as Danger and Warning, respectively, and non-carcinogenic compounds were assigned Not Required. Oral toxicity was classified into four categories based on US EPA criteria: Categories I, II, III and IV contain compounds with \(\mathrm{LD}_{50}\) values \(\le \)50 mg/kg, \(>\)50 to 500 mg/kg, \(>\)500 to 5000 mg/kg and \(>\)5000 mg/kg, respectively. Rat acute toxicity indicates toxicity in rats caused by oral exposure to chemicals. Fish acute toxicity indicates the dose that is lethal to 50 % of the fish in the study.
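The two classification schemes can be written as simple threshold functions. The sketch below only restates the thresholds given above and is not the admetSAR implementation; the function names are hypothetical, and a missing \(\mathrm{TD}_{50}\) value is assumed to stand for a non-carcinogenic compound.

```python
def oral_toxicity_category(ld50_mg_per_kg):
    """US EPA oral toxicity category from an LD50 value in mg/kg."""
    if ld50_mg_per_kg <= 50:
        return "I"
    if ld50_mg_per_kg <= 500:
        return "II"
    if ld50_mg_per_kg <= 5000:
        return "III"
    return "IV"


def carcinogenicity_class(td50_mg_per_kg_day=None):
    """Danger / Warning / Not Required from a TD50 value in mg/kg body weight/day.

    None is assumed here to mean that the compound is non-carcinogenic.
    """
    if td50_mg_per_kg_day is None:
        return "Not Required"
    return "Danger" if td50_mg_per_kg_day <= 10 else "Warning"


# Hypothetical values used only to exercise each branch.
categories = [oral_toxicity_category(x) for x in (25, 50, 300, 2000, 6000)]
classes = [carcinogenicity_class(x) for x in (5, 10, 40, None)]
```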

From Table 4, it can be observed that molecule 19078818 is a substrate of CYP2D6, a member of the Cytochrome P450 oxidase system that is involved in the metabolism of xenobiotics and is found mainly in the liver. This compound could be rapidly eliminated from the body, and increasing its dosage could lead to oral toxicity, as it lies in oral toxicity category II.

Table 4 ADME/T prediction of top 10 probable inhibitors

Conclusion

To obtain a deeper insight into the functioning of the parasite in RBCs, analysis of expression data was combined with a flux balance analysis in this study. Our analysis indicates that the ornithine decarboxylase enzyme is crucial for the transition from one stage to the next and identifies it as a putative target. This target was analysed using an in silico knockout study. The ideal target must be absent or redundant in the host to prevent undesired alteration of the normal functioning of the host system. ODC is, however, present in both host and parasite. Therefore, a comparative docking study was used to screen compounds against the targets present in both organisms, in order to select compounds that bind efficiently to the parasite target and not to its human counterpart. The results of this study will help in selecting compounds against the parasite target and so reduce the time and cost of synthesizing and screening a large number of compounds in vitro.