Introduction

Janus kinases (JAKs) are non-receptor protein tyrosine kinases that mediate the signaling of various cytokine pathways, and anomalous regulation of these pathways can lead to various cancerous conditions. Preventing or controlling these diseases requires inhibition of the cytokine pathways, either directly or by inhibiting the Janus kinases that participate in cytokine-mediated signaling as secondary messengers. JAK2 is involved in multiple cytokine pathways along with other JAK family members, and its role in IL-6-mediated signaling is significant in various physiological processes such as bone metabolism, acute phase response, hematopoiesis, and B-cell differentiation [1]. An imbalance in IL-6-mediated signaling can increase B-cell differentiation, which in turn leads to lymphoid-derived diseases. JAK3 also plays a vital role in lymphoid development via the IL-2 pathway, which regulates the activity of distinct lymphoid cell populations including T and B lymphocytes and natural killer (NK) cells [2]. An imbalance in IL-2-mediated signaling likewise contributes to the pathogenesis of lymphoid-derived diseases. Thus, the development of dual JAK2/3 inhibitors could be an efficacious strategy to treat a variety of lymphoid-derived diseases that depend on the JAK2 and JAK3 signaling cascades. AG-490, a dual JAK2/3 inhibitor in clinical trials, effectively blocks uncontrolled B-cell growth in patients suffering from acute lymphoblastic leukemia by inhibiting the abnormal constitutive activation of JAK2 detected within these cells [3]. It also blocks the IL-2-mediated growth of phytohemagglutinin (PHA)-activated or antigen-specific activated human T cells through inactivation of the JAK3 and STAT5 signaling pathway [4].

In the present study, two separate pharmacophore models were developed for the JAK2 and JAK3 enzymes; the models were then employed to screen a PHASE molecular database of 1.5 million molecules. A combined approach of pharmacophore-based virtual screening (PBVS) and structure-based virtual screening (SBVS) was used to identify new dual inhibitory agents for JAK2 and JAK3. AG-490 was used as the reference compound so that the final hits would have potency and activity close to those of this drug.

Computational methods

Molecular modeling—data selection and processing

Two datasets consisting of 252 JAK2 and 211 JAK3 inhibitors, listed in Tables S1 and S2 (supplementary data), respectively, were selected from the literature [5–23]. The biological activities (\(\hbox {IC}_{50}\) values) of the two datasets extended over wide ranges, 0.00009–29.000 and 0.0006–51.300 \(\upmu \)M, respectively. The reported \(\hbox {IC}_{50}\) values were converted into the corresponding \(\hbox {pIC}_{50}\) values by taking the negative logarithm of the \(\hbox {IC}_{50}\) values (\(-\)log \(\hbox {IC}_{50}\)) and were subsequently used for the development of the pharmacophore models. Maestro, an integrated visualization interface for all Schrödinger software, was used to sketch all compounds [24], and these molecules were optimized using Ligprep [25] employing the OPLS_2005 force field.
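The conversion described above can be illustrated with a minimal Python sketch. It assumes that the logarithm is taken of the molar concentration, which the text does not state explicitly but which is consistent with the \(\hbox {pIC}_{50}\) thresholds used later; the function name is hypothetical.

```python
import math


def pic50_from_micromolar(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar to pIC50 = -log10(IC50 in mol/L).

    Assumption: the logarithm is taken of the molar concentration.
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_um * 1e-6)


# Extremes of the JAK2 activity range quoted in the text (in micromolar).
for ic50 in (0.00009, 29.0):
    print(f"IC50 = {ic50} uM -> pIC50 = {pic50_from_micromolar(ic50):.2f}")
```

Under this assumption, the JAK2 dataset spans roughly 5.5 log units of activity.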

Pharmacophore generation

PHASE, a high-performance program module of Schrödinger for ligand-based drug design, was used to generate the pharmacophore models [26]. Since most of the studied ligands were flexible, the accessible conformations of each ligand were generated using the mixed Monte Carlo multiple minimum (MCMM)/low-mode (LMOD) conformational search approach. For conformational sampling, an energy window of 20 kcal/mol was employed to increase the chances of finding a representative conformer close to the bound structure. After conformer generation, ligands were designated as actives or inactives by applying appropriate activity threshold values, which were selected on the basis of the activity distribution of each dataset. Using the “create site” option in PHASE, pharmacophoric features were assigned to the molecules. PHASE provides six pharmacophoric features: hydrogen bond acceptor (HBA) (A), hydrogen bond donor (HBD) (D), hydrophobic (H), negatively charged (N), positively charged (P), and aromatic ring (R). The conformational space of the active molecules was used to generate common pharmacophore hypotheses (CPH), a procedure that groups together similar pharmacophores according to their intersite distances. The resulting hypotheses were scored and ranked on the basis of two scoring parameters, the survival score and the survival minus inactive (\(S-I\)) score, which provided an overall ranking and allowed the best candidate hypotheses to be identified. The hypotheses were then clustered, and a representative model from each cluster was selected on the basis of the highest \(S-I\) score. The selected models were thereafter used to align the non-model molecules, which were subsequently used for the development of the 3D-QSAR models.

3D-QSAR development

Three-dimensional quantitative structure–activity relationship (3D-QSAR) modeling is a computational method that relates independent variables (structural components) to a dependent variable (biological activity) in order to obtain a reliable statistical model for predicting the activities of new molecules [27, 28]. QSAR modeling was carried out with the selected hypotheses by randomly dividing each dataset into a training set and a test set while maintaining a proper variation of activity in both. For the generation of the 3D-QSAR models, all training set molecules, aligned over the common pharmacophoric sites, were placed into a regular grid of 1 Å cubes. Each cube was allocated 0 or 1 “bits” to account for the different types of atomic features of the training set molecules that occupy it. Each occupied cube gives rise to one or more volume bits, where a separate bit is allocated for each category of atom that occupies the cube. The resulting large pool of binary values (0 and 1) was treated as the set of independent variables for the generation of the QSAR models. The best QSAR model was selected on the basis of high values of \(Q^{2}_{\mathrm{test}}\) (correlation of prediction for the test set) and \(R^{2}_{\mathrm{train}}\) (correlation of prediction for the training set). Models with 5 and 7 components (PLS factors) and good statistics were obtained for the JAK2 and JAK3 datasets, respectively; the maximum number of PLS factors permitted in each model is one-fifth of the number of training set molecules. A further increase in the number of PLS factors did not improve the model statistics or predictive ability.
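The grid encoding described above can be sketched as follows. This is a simplified illustration rather than the PHASE implementation: the rule that a cube counts as occupied when its centre lies within an atom's radius, the atom-class labels, and all function names are assumptions made for the example.

```python
import math


def volume_bits(atoms, spacing=1.0):
    """Return the set of (ix, iy, iz, atom_class) bits switched on by one molecule.

    atoms: iterable of (x, y, z, radius, atom_class) for an aligned molecule.
    Assumption: a cube is occupied when its centre lies within the atom radius.
    """
    bits = set()
    for x, y, z, radius, atom_class in atoms:
        lo = [math.floor((c - radius) / spacing) for c in (x, y, z)]
        hi = [math.floor((c + radius) / spacing) for c in (x, y, z)]
        for ix in range(lo[0], hi[0] + 1):
            for iy in range(lo[1], hi[1] + 1):
                for iz in range(lo[2], hi[2] + 1):
                    centre = tuple((i + 0.5) * spacing for i in (ix, iy, iz))
                    if math.dist(centre, (x, y, z)) <= radius:
                        # One separate bit per atom class occupying the cube.
                        bits.add((ix, iy, iz, atom_class))
    return bits


def bit_matrix(molecules):
    """Binary descriptor matrix: one row per molecule, one column per bit."""
    per_molecule = [volume_bits(m) for m in molecules]
    columns = sorted(set().union(*per_molecule))
    rows = [[1 if c in bits else 0 for c in columns] for bits in per_molecule]
    return columns, rows


# Two hypothetical one-atom "molecules" occupying the same cube with
# different atom classes give two distinct columns.
columns, rows = bit_matrix([
    [(0.5, 0.5, 0.5, 0.4, "D")],
    [(0.5, 0.5, 0.5, 0.4, "H")],
])
print(columns, rows)
```

The rows of such a matrix play the role of the independent variables that are regressed against \(\hbox {pIC}_{50}\) by PLS.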

Since a pharmacophore model is a theoretical model, it is necessary to analyze whether or not the developed model is able to predict the active compounds. Thus, the developed models need to be validated before further use. The models were validated by several methods, including both internal and external validation. To assess the external predictive ability and reliability of the generated QSAR models, the test set predictions were examined using Pearson-r. The Pearson correlation coefficient measures the predictive reliability of the generated model for external molecules that were not considered during model development, i.e., the test set molecules. In addition, the external predictive ability of the generated models was further validated by calculating a set of parameters [29]: \(R_{\mathrm{o}}^{2}\) or \({R_{\mathrm{o}}^{\prime }}^{2}\) should be close to \(R^{2}\), such that \([(R^{2}-R_{\mathrm{o}}^{2})/R^{2}] < 0.1\) or \([(R^{2}-{R_{\mathrm{o}}^{\prime }}^{2})/R^{2}] < 0.1\), with the corresponding slopes satisfying \(0.85 \le k \le 1.15\) or \(0.85 \le k^{\prime } \le 1.15\).
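The external validation parameters listed above can be computed as in the following sketch. It uses one common, self-consistent formulation of the regression-through-origin statistics (observed against predicted for \(k\) and \(R_{\mathrm{o}}^{2}\), predicted against observed for \(k^{\prime }\) and \({R_{\mathrm{o}}^{\prime }}^{2}\)); conventions differ between publications, so the exact definitions used in [29] should be checked before comparing numbers. The function names and the example activities are hypothetical.

```python
def external_validation(observed, predicted):
    """Regression-through-origin statistics for a test set (one common formulation)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    sxy = sum(o * p for o, p in zip(observed, predicted))

    r2 = cov ** 2 / (ss_obs * ss_pred)             # squared Pearson correlation
    k = sxy / sum(p * p for p in predicted)        # observed ~ k * predicted
    k_prime = sxy / sum(o * o for o in observed)   # predicted ~ k' * observed
    r0_2 = 1 - sum((o - k * p) ** 2 for o, p in zip(observed, predicted)) / ss_obs
    r0p_2 = 1 - sum((p - k_prime * o) ** 2 for o, p in zip(observed, predicted)) / ss_pred
    return {"r2": r2, "k": k, "k_prime": k_prime, "r0_2": r0_2, "r0p_2": r0p_2}


def passes_criteria(stats):
    """Apply the thresholds quoted in the text to either the unprimed or primed set."""
    unprimed = (stats["r2"] - stats["r0_2"]) / stats["r2"] < 0.1 and 0.85 <= stats["k"] <= 1.15
    primed = (stats["r2"] - stats["r0p_2"]) / stats["r2"] < 0.1 and 0.85 <= stats["k_prime"] <= 1.15
    return unprimed or primed


# Hypothetical observed and predicted pIC50 values for a small test set.
obs = [5.2, 6.1, 7.0, 7.9, 8.8]
pred = [5.4, 6.0, 6.8, 8.1, 8.6]
stats = external_validation(obs, pred)
print(stats, passes_criteria(stats))
```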

The other validation methods were applicability domain (APD) calculation, which determines the resemblance between training and test set compounds, and Y-randomization, which analyzes the robustness of the generated model. The APD calculation was carried out using the Canvas program, in which similarity was measured on the basis of the Euclidean distances between all pairs of training and test set compounds. For this validation parameter, the distances of the test set molecules must lie within the threshold (APD); otherwise the selected model is considered unreliable for the prediction of new compounds.

$$\begin{aligned} \hbox {APD}=\langle d \rangle +\sigma Z \end{aligned}$$

where \(\langle d \rangle \) is the recalculated average and \(\sigma \) is the standard deviation of the training set distances that are lower than the initially calculated average distance, and Z is an empirical cutoff with a default value of 0.5.
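The APD threshold defined above can be sketched in Python as follows. The descriptor vectors are hypothetical, and two details are assumptions not fixed by the text: the population standard deviation is used for \(\sigma \), and a query compound is judged by its distance to the nearest training compound.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def apd_threshold(train, z=0.5):
    """APD = <d> + Z * sigma, computed from training-set pairwise distances."""
    dists = [math.dist(a, b) for a, b in combinations(train, 2)]
    overall = mean(dists)
    # Keep only distances lower than the initial average; fall back to all
    # distances in the degenerate case where none is strictly lower.
    below = [d for d in dists if d < overall] or dists
    return mean(below) + z * pstdev(below)


def in_domain(query, train, apd):
    """Assumption: a compound is inside the domain if its nearest
    training-set neighbour is no farther away than the APD threshold."""
    return min(math.dist(query, t) for t in train) <= apd


# Hypothetical two-dimensional descriptor vectors.
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
apd = apd_threshold(training)
print(round(apd, 3), in_domain((0.5, 0.5), training, apd), in_domain((3.0, 3.0), training, apd))
```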

Y-randomization is performed by randomly scrambling the activity data of the training set molecules to generate different training sets from the original one. The value of \(R^{2}\) was then determined for each random set, and the average value so obtained was reported as \(R_{\mathrm{scramble}}^{2}\), which was calculated using the PHASE module. The value of \(R_{\mathrm{scramble}}^{2}\) should be lower than the \(R_{\mathrm{train}}^{2}\) of the original selected model.
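The scrambling procedure can be illustrated with a small self-contained sketch. For brevity it swaps in a one-descriptor least-squares fit in place of the PLS model built in PHASE, and the descriptor and activity values are hypothetical; only the shuffle-refit-average logic is the point of the example.

```python
import random


def r_squared(x, y):
    """R^2 of a one-descriptor least-squares fit (equals the squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def y_randomization(x, y, n_trials=100, seed=0):
    """Average R^2 over models refitted to randomly scrambled activities."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the descriptor-activity pairing
        scores.append(r_squared(x, shuffled))
    return sum(scores) / len(scores)


# Hypothetical descriptor values and activities with a strong linear relation.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
activity = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1, 14.0, 16.2, 17.9, 20.1]
r2_train = r_squared(descriptor, activity)
r2_scramble = y_randomization(descriptor, activity)
print(f"R2_train = {r2_train:.3f}, R2_scramble = {r2_scramble:.3f}")
```

A model that captures a genuine structure-activity relationship should lose most of its fit once the activities are scrambled.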

Virtual screening

Virtual screening (VS) was performed in order to identify those structures in a conformer database that are most likely to bind to the drug target. VS was carried out systematically by integrating two techniques, SBVS and ligand-based virtual screening (LBVS). Database screening with a pharmacophore model is an LBVS method that screens millions of multi-conformational compounds to retrieve structurally diverse hits. To overcome the main drawback of LBVS, i.e., its inability to identify false hits, SBVS and LBVS were combined into an integrated approach within the drug discovery protocol.

A PHASE database of 1.5 million molecules was screened with the selected JAK2 and JAK3 pharmacophore models to identify new hits. These hit compounds contain structural features that overlap with the selected models. The hit molecules were further docked into the JAK2 and JAK3 proteins using Glide [30, 31] to remove false hits. Parameters related to the oral bioavailability and pharmacokinetic profile of the designed molecules were calculated using Qikprop [32] and were finally used for filtering and ranking the large number of screened molecules. Qikprop reports, among other descriptors, violations of Lipinski’s rule of five. According to this rule, poor absorption is expected if molecular weight (MW) \(>\) 500, partition coefficient (logP) \(>\) 5, HBDs \(>\) 5, and HBAs \(>\) 10. The ADME parameters include the partition coefficient (\(\textit{QP}log\textit{P}_{\mathrm{o}/\mathrm{w}})\), water solubility (QPlogS), cell permeability (QPPCaco), percentage human oral absorption, etc.
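The rule-of-five check quoted above is easy to express in code. The sketch below only counts violations; the property values of the example molecule are hypothetical, and no particular pass threshold is implied by the text.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Return the names of the rule-of-five criteria that a molecule violates."""
    checks = {
        "MW > 500": mw > 500,
        "logP > 5": logp > 5,
        "HBD > 5": hbd > 5,
        "HBA > 10": hba > 10,
    }
    return [name for name, violated in checks.items() if violated]


# Hypothetical property values for a single screened molecule.
print(lipinski_violations(mw=412.5, logp=3.2, hbd=2, hba=6))
print(lipinski_violations(mw=612.0, logp=5.8, hbd=2, hba=6))
```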

Finally, the ligands were post-processed with Prime/MM–GBSA, which predicts the binding energy of a set of ligands to a single receptor. The MM–GBSA binding energy (\(\Delta G_{\mathrm{bind}})\) is estimated in kcal/mol using the equation:

$$\begin{aligned} \hbox {MM}{-}\hbox {GBSA }\Delta G_{\mathrm{bind}} =\hbox { ER}{:}\hbox {EL}-\hbox {EL}-\hbox {ER} \end{aligned}$$

where ER:EL is the Prime energy of the optimized complex, EL is the Prime energy of the optimized free ligand, and ER is the Prime energy of the optimized free receptor.
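As a worked illustration of the equation above, the following sketch evaluates \(\Delta G_{\mathrm{bind}}\) from three energies. The numerical values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def mmgbsa_dg_bind(e_complex, e_ligand, e_receptor):
    """Binding energy (kcal/mol) = E(complex) - E(free ligand) - E(free receptor)."""
    return e_complex - e_ligand - e_receptor


# Hypothetical energies in kcal/mol.
dg = mmgbsa_dg_bind(e_complex=-1000.0, e_ligand=-50.0, e_receptor=-900.0)
print(dg)
```

A more negative value indicates that the complex is predicted to be more stable than the separated ligand and receptor.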

Results and discussion

Pharmacophore generation and 3D-QSAR

A pharmacophore modeling study of phenylaminopyrimidine JAK2 inhibitors, based on 44 molecules, was reported in 2011 [33]. In the present study, a diverse dataset of 252 JAK2 inhibitors, including the above-mentioned series, was used. For the generation of the pharmacophore models, the molecules were divided into active, inactive, and moderately active molecules. For the JAK2 inhibitors, the threshold values were 9.35 for active ligands and 5.10 for inactive ligands; accordingly, 16 molecules were considered active, 14 inactive, and the rest moderately active. For the JAK3 inhibitors, the threshold values were 8.20 for actives and 5.00 for inactives, giving 16 active and 15 inactive molecules. The conformational space of each molecule was then sampled using the MCMM/LMOD algorithm. A maximum of 1,000 conformers was generated for each molecule within an energy window of 20 kcal/mol, and a root mean square deviation (RMSD) cutoff of 1 Å was applied to remove redundant conformers. For the generation of the pharmacophore models, the software was restricted to exploring a minimum of 4 and a maximum of 5 sites for both JAK2 and JAK3 inhibitors, and the models were required to match 16 of 16 and 12 of 16 active molecules, respectively. A total of 17 hypotheses were obtained for JAK2 and 13 for JAK3 inhibitors; these were subsequently ranked on the basis of the survival score and the survival minus inactive (\(S-I\)) score, which correspond to the active score and the inactive score, respectively. The generated hypotheses had very similar active and inactive scores; therefore, to avoid selecting hypotheses of the same kind, all generated hypotheses were clustered and a representative of each cluster was selected on the basis of the highest \(S-I\) score. Thus, 5 hypotheses belonging to different clusters were selected for each of JAK2 and JAK3, and the statistical parameters of these models are reported in Table 1.
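The activity thresholds quoted above partition each dataset into three classes, as in the sketch below. Whether the boundaries are inclusive is an assumption, and the example \(\hbox {pIC}_{50}\) values are hypothetical.

```python
def classify(pic50, active_min, inactive_max):
    """Label a ligand from its pIC50 (boundaries assumed to be inclusive)."""
    if pic50 >= active_min:
        return "active"
    if pic50 <= inactive_max:
        return "inactive"
    return "moderately active"


# Thresholds from the text: (active_min, inactive_max).
THRESHOLDS = {"JAK2": (9.35, 5.10), "JAK3": (8.20, 5.00)}

for target, (active_min, inactive_max) in THRESHOLDS.items():
    labels = [classify(p, active_min, inactive_max) for p in (9.5, 7.0, 4.8)]
    print(target, labels)
```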

Table 1 Statistical results of PHASE-generated pharmacophore hypotheses for JAK2 and JAK3 inhibitors

The 3D-QSAR models for both datasets were generated by dividing the molecules into training and test sets with a uniform variation of biological activity (Table 2). The best hypothesis for JAK2 was ADRR.92, indicating that JAK2 inhibitors share one hydrogen bond acceptor (A), one HBD (D), and two aromatic ring (R) features. This hypothesis was selected on the basis of the highest \(Q_{\mathrm{test}}^{2}\) value (0.663); it also showed a high \(R_{\mathrm{train}}^{2}\) (0.849) and F value (219.3). The large value of F (Fisher test value) indicates a statistically significant regression model. Similarly, the best hypothesis for JAK3 (ADDRR.142) showed a high \(Q^{2}_{\mathrm{test}}\) (0.655), a high \(R^{2}_{\mathrm{train}}\) (0.869), and a high F value (206.9). ADDRR.142 comprises five features: one HBA (A), two HBD (D), and two aromatic ring (R) features. The selected JAK2 and JAK3 models also showed better predictive power than the other models. The statistical results of the generated QSAR models are given in Table 3. The spatial arrangement of the features and their intersite distances in the four-featured (ADRR.92) and five-featured (ADDRR.142) pharmacophore models of JAK2 and JAK3, and the mapping of these models over their corresponding most active molecules, are shown in Fig. 1a–d. The correlation graphs between the experimental and predicted activities of the training set and test set molecules for the best models are displayed in Fig. 2a–d. The high Pearson-r values for the test set molecules, 0.837 (JAK2) and 0.816 (JAK3), indicate the high external predictive ability of these models. The best models also showed acceptable values of \(k\) (1.007), \(k^{\prime }\) (0.984), \(R_{\mathrm{o}}^{2}\) (0.995), and \({R_{\mathrm{o}}^{\prime }}^{2}\) (0.989) for JAK2 and of \(k\) (0.991), \(k^{\prime }\) (1.000), \(R_{\mathrm{o}}^{2}\) (0.993), and \({R_{\mathrm{o}}^{\prime }}^{2}\) (1.000) for JAK3, confirming the prediction reliability of both selected models. The calculated APD values of the test set molecules were also within the range of those of the training set molecules, indicating the reliability of the models for the prediction of new compounds (Table 4). In Y-randomization, the selected 3D-QSAR models of JAK2 and JAK3 exhibited \(R_{\mathrm{scramble}}^{2}\) values of 0.638 and 0.684, respectively, which are lower than the corresponding original \(R_{\mathrm{train}}^{2}\) values (0.849 and 0.869), supporting the robustness of the selected models. Three features of the selected JAK2 model, i.e., HBA (A), HBD (D), and aromatic ring (R), are similar to those of the previously published pharmacophore model [33].

Table 2 Details of dataset for 3D-QSAR of JAK2 and JAK3 inhibitors
Table 3 Statistical results of the generated 3D-QSAR models for JAK2 and JAK3 inhibitors
Table 4 Applicability domain calculations for the test set molecules of JAK2 and JAK3
Fig. 1 Intrasite distances of the best pharmacophore models ADRR.92 (a) and ADDRR.142 (c), mapping of ADRR.92 over the most active molecule JAK2-150 (b), and mapping of ADDRR.142 over the most active molecule JAK3-43 (d)

Fig. 2 Correlation graphs between experimental and predicted activities of: training set molecules of JAK2 (a), test set molecules of JAK2 (b), training set molecules of JAK3 (c), and test set molecules of JAK3 (d)

Virtual screening

In our integrated VS protocol, 1.5 million PHASE database molecules were screened through the validated JAK2 pharmacophore model, and 1,000 molecules were retrieved, this being the maximum number of output molecules allowed. These 1,000 molecules were subsequently screened through the JAK3 model, which retrieved 436 molecules. These 436 molecules contain the structural features of both pharmacophore models but could adopt conformations that do not interact properly with the JAK2 and JAK3 proteins. Docking analysis was therefore carried out to discard false hits, i.e., molecules that do not interact properly with the proteins and molecules whose interactions are not favourable. A number of 3D structures of both JAK2 and JAK3 in complex with different ligands are available in the Protein Data Bank (PDB). For the docking analysis, suitable structures were selected on the basis of cross-docking experiments (Fig. 3).

Fig. 3 Flowchart of hierarchical virtual screening protocol

Protein selection

Cross-docking is the process of extracting all ligands from their crystal structures and then redocking each of them into every crystal protein individually. The redocked ligands are then aligned over their actual crystal poses, and the deviation between the two poses is assessed as the RMSD. The average RMSD of all redocked ligands in each protein was determined as a measure of the quality of that crystal structure; a lower RMSD value represents the ability of the crystal protein to dock molecules more accurately. Among the 28 crystal structures available for JAK2 with different ligands, six (PDB ID: 3KRR, 3E62, 3E63, 3E64, 3UGC, and 4FVQ) were selected on the basis of good resolution (\(<\)2 Å). Of these, 3UGC showed the lowest RMSD value [34–37]. Eight crystal structures of JAK3 (PDB ID: 1YVJ, 4HVD, 3PJC, 4HVI, 4HVH, 4HVG, 3LXK, and 3LXL) were available in the PDB [15, 38–40]. Among them, 4HVD was selected for further docking on the basis of the cross-docking experiments. The results of the cross-docking analysis are given in Table 5. Thus, the crystal proteins 3UGC for JAK2 and 4HVD for JAK3 were selected for docking analysis of the hits retrieved after VS.
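The selection logic of the cross-docking experiment can be sketched as follows. The sketch assumes that all poses are already expressed in a common coordinate frame with identical atom ordering and applies no symmetry correction; the receptor and ligand labels and the coordinates are hypothetical.

```python
import math


def rmsd(pose_a, pose_b):
    """Root mean square deviation between two poses with matched atom order."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(pose_a, pose_b))
    return math.sqrt(squared / len(pose_a))


def rank_receptors(redocked, crystal):
    """Average RMSD per receptor and the receptor with the lowest average.

    redocked[receptor][ligand] is the docked pose in that receptor;
    crystal[ligand] is the crystallographic pose of the same ligand.
    """
    averages = {
        receptor: sum(rmsd(pose, crystal[lig]) for lig, pose in poses.items()) / len(poses)
        for receptor, poses in redocked.items()
    }
    return min(averages, key=averages.get), averages


# Hypothetical two-atom ligands docked into two hypothetical receptors.
crystal = {"L1": [(0, 0, 0), (1, 0, 0)], "L2": [(0, 0, 0), (0, 1, 0)]}
redocked = {
    "recA": {"L1": [(0, 0, 0), (1, 0, 0)], "L2": [(0, 0, 1), (0, 1, 1)]},
    "recB": {"L1": [(0, 0, 2), (1, 0, 2)], "L2": [(0, 0, 2), (0, 1, 2)]},
}
best, averages = rank_receptors(redocked, crystal)
print(best, averages)
```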

Table 5 Essential amino acid residues’ interactions along with average RMSD values of JAK2 and JAK3 proteins

Investigation of important amino acid residues of JAK2 and JAK3 for their inhibition

Not all amino acid residues present in the active sites of JAK2 and JAK3 are equally important for their inhibition. Hence, the relative importance of the active site residues for inhibition of both enzymes was determined by docking analysis of the most active molecule of each dataset. Additionally, the dual JAK2/3 inhibitor AG-490, which is under clinical development, was also subjected to docking analysis. The interactions shown by the most active molecules of JAK2 and JAK3 and by the dual inhibitor AG-490 with the active site residues of these enzymes are given in Table 6. From this docking analysis it was concluded that residues Glu930 and Leu932 are crucial for JAK2 inhibitory activity, whereas residues Glu903 and Leu905 are important for JAK3 inhibitory activity. Moreover, the available crystal structures of JAK2 and JAK3 were visually analyzed to determine the residues important for inhibition of these enzymes. The residues found to be important by visual inspection were consistent with the docking results. The interactions observed in the crystal structures of JAK2 and JAK3 are given in Table 5.

Table 6 Docking interaction of active molecules and drug molecule along with glide g-score

Docking, ADME, and energy calculation

Docking was carried out to increase the power of the pharmacophore-based screening and to differentiate between active and inactive ligands. The 436 database candidates retrieved from VS were docked into the active site of the JAK2 protein. Among them, 204 molecules showing the important inhibitory interactions with residues Glu930 and Leu932 were selected and subsequently docked into the JAK3 protein. After visual inspection, 69 compounds were retained on the basis of their interactions with the important residues Glu903 and Leu905. These 69 molecules showed interactions with both proteins and are therefore predicted to possess dual inhibitory activity against JAK2 and JAK3.

Finally, ADME properties were calculated to filter out molecules with poor pharmacokinetic profiles. Incorporating ADME predictions into the drug development process can generate lead compounds that are more likely to exhibit satisfactory ADME performance during clinical trials. Of the 69 candidates, 24 passed the pharmacokinetic filters, indicating their potential to act as drug-like molecules. These 24 dual JAK2/JAK3 inhibitors were further subjected to Prime MM–GBSA for the calculation of binding free energy. The MM–GBSA binding free energies and pharmacokinetic parameters of the final 24 hits are given in Table 7. The Prime MM–GBSA binding free energies, ranging from \(-\)59.545 to \(-\)98.069 kcal/mol (JAK2) and from \(-\)54.544 to \(-\)77.280 kcal/mol (JAK3), and the Glide docking scores, ranging from \(-\)6.414 to \(-\)12.405 (JAK2) and from \(-\)6.029 to \(-\)10.224 (JAK3), of the selected 24 molecules are comparable to those of the clinical trial drug AG-490, suggesting good binding affinity for both enzymes. The MM–GBSA scores of the selected molecules were also found to be comparable to those of the co-crystal ligands, for which the MM–GBSA scores were \(-\)152.081 (JAK2) and \(-\)79.942 (JAK3). Interestingly, the basic structural motifs of two of the 24 hits (Phase ID 0855481 and 0855521) have been claimed in a patent filed by Almirall S.A. as dual JAK2/3 inhibitors [41]. Thus, these 24 molecules could be promising drug candidates for targeting the JAK2 and JAK3 enzymes. The interaction details and Glide g-scores of all 24 final hits are given in Table 8.

Table 7 ADME properties of 24 newly designed multikinase inhibitor molecules using Qikprop module of Maestro 9.2
Table 8 Predicted activities and docking interactions of final 24 molecules after VS

The docking results of one of the hit compounds, compound 0288703, which showed good predicted dual inhibitory activity, are displayed in Fig. 4. Compound 0288703 shows two hydrogen bonding interactions with the JAK2 protein (Fig. 4a): the hydroxyl group and the carbonyl group of the compound interact with the carbonyl group of Glu930 and the amino group of Leu932, respectively. The compound also interacts with the JAK3 protein through five hydrogen bonding interactions (Fig. 4b). The hydroxyl group of the compound forms two hydrogen bonds with Leu905; the oxygen atom and the hydrogen atom of this hydroxyl group interact with the amino group and the carbonyl group of Leu905, respectively. The nitrogen of the pyridine nucleus of the compound interacts with the side-chain NH of Arg911, the carbonyl group of the compound interacts with the SH of Cys903, and the NH of the compound interacts with Leu828.

Fig. 4 Docking interactions of compound 0288703 with JAK2 protein (a) and JAK3 protein (b)

Conclusion

The multi-target potential of drug molecules is considered beneficial for the treatment of multi-pathway diseases. Lymphoid-derived diseases originate from multiple pathways; thus, intervention with molecules having multi-target potential could serve as a better therapy. Two well-known kinases, JAK2 and JAK3, are reported in the literature to be involved in the progression of lymphoid-derived diseases. Our study therefore aimed to design new molecules with inhibitory potential against both JAK2 and JAK3.

Two ligand-based pharmacophore models were generated from the datasets of JAK2 and JAK3 inhibitors to identify the essential structural features required for inhibition of both enzymes, which are helpful for screening novel molecules with inhibitory activity against both enzymes. The correlation and predictive statistics show that the selected models are statistically significant and support reliable inferences. Finally, 24 potential hits with good pharmacokinetic profiles and good predicted activity were identified by ADME property calculation and docking analysis, respectively. This study provides a set of guidelines that should help in designing novel and more potent JAK2/JAK3 dual inhibitors.