Introduction

Cancer is a disease in which malignant cells divide uncontrollably; such cells can propagate, invade, and destroy healthy tissues or organs via metastasis (Varmus 2006; Nishida et al. 2006). In 2020, the World Health Organization reported that cancer is the second leading cause of death worldwide. The economic burden of cancer is considerable and growing. The formation of new blood vessels via angiogenesis is controlled by activator molecules and is significant for proliferation and metastasis. Cancer cells rely upon angiogenesis for oxygen and nutrient supply (Folkman 1995). The human kallikrein-related peptidases (KLKs), a family of serine proteases, are known to be expressed in various cancers. KLKs contribute to pathological angiogenesis by degrading extracellular matrix (ECM) components (Lilja 1985; Emami and Diamandis 2008). Thus, targeting the KLK cascade can provide novel inhibitors for improving cancer survival.

The human kallikrein-related peptidase (KLK) family includes 15 highly conserved serine proteases (KLK-1 to KLK-15). The protein KLK-14 is overexpressed in prostate, ovarian, and breast cancers (Borgoño and Diamandis 2004). KLK proteins are pre-proenzymes containing an amino-terminal signal sequence (Pre), a propeptide (Pro) that maintains them as inactive precursors (zymogens), and a serine-protease domain that is responsible for catalytic action. In KLK-14, the Pre peptide sequence extends up to Ser34 and the Pro peptide sequence up to Lys40; the activation cleavage site lies between Lys40 and Ile41, after which the catalytic domain begins (Webber et al. 1995; Borgoño et al. 2004; Kalinska et al. 2016).
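The domain boundaries described above can be illustrated with a short sketch. It assumes 1-based residue numbering on the full pre-proenzyme, and it uses a placeholder string of 267 residues (the length reported for UniProt entry Q9P0G3) rather than the real sequence; the function and type names are hypothetical.

```python
from typing import NamedTuple


class ZymogenRegions(NamedTuple):
    pre: str        # signal peptide (residues 1 to 34)
    pro: str        # propeptide (residues 35 to 40)
    catalytic: str  # serine-protease domain (residue 41 onward)


def split_prepro(sequence: str, pre_end: int = 34, pro_end: int = 40) -> ZymogenRegions:
    """Split a pre-proenzyme sequence at the Pre and Pro boundaries.

    pre_end and pro_end are 1-based positions of the last residue of the
    Pre and Pro segments (Ser34 and Lys40 in the text).
    """
    if not (0 < pre_end < pro_end < len(sequence)):
        raise ValueError("boundaries must satisfy 0 < pre_end < pro_end < len(sequence)")
    return ZymogenRegions(
        pre=sequence[:pre_end],
        pro=sequence[pre_end:pro_end],
        catalytic=sequence[pro_end:],
    )


# Placeholder only: NOT the KLK-14 sequence, just a string of the same length.
placeholder = "A" * 267
regions = split_prepro(placeholder)
print(len(regions.pre), len(regions.pro), len(regions.catalytic))
```

With the real FASTA sequence in place of the placeholder, the first residue of `regions.catalytic` should correspond to Ile41.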

KLK-14 exhibits dual substrate specificity: trypsin-like (cleavage after arginine or lysine) and chymotrypsin-like (cleavage after tyrosine, tryptophan, or phenylalanine). The catalytic triad of KLK-14 comprises His83, Asp127, and Ser220, and the serine residue (Ser220) carries out the nucleophilic attack on the substrate. KLK-14 facilitates angiogenesis through degradation of fibronectin, a large multi-domain ECM glycoprotein (Borgoño et al. 2007a, b). Figure 1 shows how the degradation of ECM fibronectin ultimately increases the bioavailability of VEGF (Wijelath et al. 2006).
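As a rough illustration of this dual specificity, the sketch below lists candidate P1 residues in a peptide. It is a simplification that looks only at the residue preceding the scissile bond and ignores all other subsite preferences; the example peptide is hypothetical and is not taken from fibronectin.

```python
TRYPSIN_LIKE = set("RK")         # arginine, lysine
CHYMOTRYPSIN_LIKE = set("YWF")   # tyrosine, tryptophan, phenylalanine


def candidate_cleavage_sites(sequence: str):
    """Return (position, residue, specificity) for each candidate P1 residue.

    Positions are 1-based. The C-terminal residue is skipped because there
    is no peptide bond after it.
    """
    sites = []
    seq = sequence.upper()
    for pos, residue in enumerate(seq[:-1], start=1):
        if residue in TRYPSIN_LIKE:
            sites.append((pos, residue, "trypsin-like"))
        elif residue in CHYMOTRYPSIN_LIKE:
            sites.append((pos, residue, "chymotrypsin-like"))
    return sites


# Hypothetical peptide used only to exercise the function.
for site in candidate_cleavage_sites("GARFLKYTA"):
    print(site)
```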

Fig. 1 Biochemical pathway of KLK-14 protein

Angiogenesis and tumor growth are observed in a diverse range of cancers and are attributed to decreased fibronectin expression or increased fibronectin degradation (Hynes 1990). KLK7 and KLK3 are also able to cleave fibronectin in prostate cancer (Ramani and Haun 2008). A recent review of the literature suggests that the KLK-14 protein is a novel target against angiogenesis. Hence, the present study aims to identify new molecular entities that prevent angiogenesis by targeting the interaction of KLK-14 with fibronectin.

Materials and methods

Homology modeling is a computational method used to determine the 3D structure of a target protein when no experimental structure (X-ray crystallography, NMR spectroscopy, or electron microscopy) has been reported. It is a multi-step process involving sequence retrieval of the target protein, 3D model building using an appropriate template, evaluation and refinement of the predicted 3D structure, and identification of the target protein's active site. Finally, virtual screening is performed at the active site, followed by prediction of the ADME properties of the docked ligands (Nambigari et al. 2012; Bandi and Nambigari 2021).

Functions in cells are controlled by interactions between biomolecules, such as proteins, nucleic acids, and small ligands. Understanding such interactions is therefore a crucial step in the investigation of biological systems and in drug design. The binding affinity of a complex, expressed as the Gibbs free energy of binding (∆G), is a crucial quantity for the study of such systems, since it determines whether an interaction will actually occur in the cell. PROtein binDIng enerGY prediction (PRODIGY) is a web server for the prediction of binding affinity in protein–protein complexes (Vangone 2019).
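For orientation, a binding free energy can be related to a dissociation constant through the standard thermodynamic relation ∆G = RT ln(Kd). The sketch below applies that relation; the temperature of 298.15 K and the example value of −10.0 kcal/mol are illustrative assumptions, not values from this study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant (mol/L) from a binding free energy (kcal/mol)."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))


def dg_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from a dissociation constant (mol/L)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# Illustrative value only: a more negative dG corresponds to a smaller Kd.
print(f"Kd = {kd_from_dg(-10.0):.2e} M")
```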

Homology modeling of KLK-14 protein

The KLK-14 protein sequence (267 amino acids, UniProt ID Q9P0G3) is retrieved in FASTA format from UniProt via the Expert Protein Analysis System (ExPASy) server (Gasteiger 2003; Artimo et al. 2012; 2015). The NCBI-BLASTp (Pertsemlidis and Fondon 2001), JPred (Cuff and Barton 1999), and HHpred (Zimmermann et al. 2018) servers are used to find a template protein homologous to the target sequence on the basis of sequence similarity, secondary structure, query coverage, percent identity, and E-value (Kerfeld 2011). The alignment of the template with the target protein sequence is carried out using Clustal W (Ahola et al. 2006). The initial 3D structure of the KLK-14 protein is generated using Modeller 9v9 (Šali et al. 1995). Initially, 200 models are generated, of which the model with the lowest Modeller objective function is selected for further refinement. The generated 3D model is further refined by loop modeling and energy minimization using Swiss PDB Viewer 4.1.0 (Kaplan et al. 2001). The quality of the model is then improved by molecular dynamics simulation via the locPREFMD (local Protein structure REFinement via Molecular Dynamics) web tool (Feig 2016).
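The model selection step can be sketched as follows. The list-of-dictionaries layout (keys `name`, `molpdf`, `failure`) is assumed here to mirror what Modeller reports for each generated model, and the file names and scores are hypothetical placeholders, not results from this study.

```python
def select_best_model(models):
    """Return the successfully built model with the lowest objective function.

    Each entry is a dict with 'name', 'molpdf' (objective function value)
    and 'failure' (None when the model was built without errors).
    """
    candidates = [m for m in models if m.get("failure") is None]
    if not candidates:
        raise ValueError("no successfully built models to choose from")
    return min(candidates, key=lambda m: m["molpdf"])


# Hypothetical scores for three of the generated models.
models = [
    {"name": "model_001.pdb", "molpdf": 1520.4, "failure": None},
    {"name": "model_002.pdb", "molpdf": 1488.9, "failure": None},
    {"name": "model_003.pdb", "molpdf": 1301.2, "failure": "optimization failed"},
]
print(select_best_model(models)["name"])
```

Failed builds are excluded before ranking, so a low score on an incomplete model cannot be selected by accident.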

Validation of 3D model of KLK-14 protein

The 3D model of KLK-14 protein is refined further by loop modeling and energy minimization with the conjugate gradient method, using Swiss PDB Viewer (SPDBV4.0), until the final convergence criteria are satisfied (Guex and Peitsch 1997). The PROCHECK server calculates the backbone dihedral angles (phi and psi) of the 3D model, presents them as a Ramachandran plot, and thereby validates the stereochemical quality of the model (Laskowski et al. 1993). The ProSA server validates the overall fold and local model quality of the target protein by comparison with structures of similar amino acid length deposited in the PDB (Wiederstein and Sippl 2007). The program VERIFY3D (Eisenberg et al. 1997) measures the compatibility of the 3D model with its own amino acid sequence (3D–1D profile) based on statistical preferences; a model is considered acceptable when > 70% of its residues pass the 3D–1D score criterion.
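The dihedral angles plotted in a Ramachandran plot can be computed from four consecutive backbone atoms. The following is a minimal, dependency-free sketch of the standard signed dihedral formula; it is not the PROCHECK implementation, and the coordinates in the example are arbitrary test points rather than atoms from the KLK-14 model.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# phi(i) = dihedral(C[i-1], N[i], CA[i], C[i])
# psi(i) = dihedral(N[i], CA[i], C[i], N[i+1])
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```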

Active site identification and protein–protein docking

A variety of computational tools are used for active site identification and analysis, a significant step in drug design. The CASTp and Active Site Prediction servers were used to predict the active site regions of the KLK-14 protein (Dundas et al. 2006). These tools calculate the area and volume of each cavity to locate the active site domain. Protein–protein docking is then performed with the geometry-based shape-complementarity method of the PatchDock server (Duhovny et al. 2002; Schneidman-Duhovny et al. 2005). The inputs for docking are the natural substrate, used as the receptor (fibronectin, PDB ID: 1FNH, resolution 2.1 Å, retrieved from the Protein Data Bank), and the 3D model of KLK-14. The receptor was prepared by removing all hetero atoms and water molecules, followed by the addition of polar hydrogens. The clustering RMSD was set to 4 Å, with the rest of the parameters at their default settings. To validate the putative binding residues, the docked complex of KLK-14 and fibronectin is analyzed.
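The RMSD threshold mentioned above can be made concrete with a small sketch. It computes a plain coordinate RMSD between two equally ordered atom lists without any superposition, which is a simplification of what a docking server does internally; the coordinates are arbitrary test points and the helper names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


def same_cluster(coords_a, coords_b, threshold=4.0):
    """Treat two poses as redundant when their RMSD is within the threshold (in angstrom)."""
    return rmsd(coords_a, coords_b) <= threshold


pose_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose_2 = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
print(rmsd(pose_1, pose_2), same_cluster(pose_1, pose_2))
```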

Virtual screening

Drug discovery research relies on virtual screening (VS) as an essential method for evaluating large compound libraries to discover new drugs. The ligand–target approach has become very popular, and the number of techniques and software packages in use is increasing (Lavecchia and Giovanni 2013). The VS procedure includes protein preparation, preparation of the ligand database, and docking (Lionta et al. 2014; Huang et al. 2016). Protein surface atoms and site points are calculated internally by the docking software. Computational methods predict the best ligand hits by 'docking' a ligand library into the target protein and 'scoring' their complementarity to its binding sites (Kitchen et al. 2004).

Preparation of in-house database of bioactive phytocomponents

Plants have been an invaluable resource of traditional remedies since ancient times and an inspiration for the development of therapeutic agents (Mishra and Tiwari 2011). Phytochemicals with antioxidant and nutrient-protecting properties prevent the production of carcinogens (cancer-causing agents) in the body. Phytochemicals have advantages over synthetic chemicals, such as increasing immunity, reducing inflammation, preventing DNA damage, and facilitating DNA repair, thereby slowing the growth of cancer cells. Phytochemicals are safe, have low toxicity, are universally available, and can synergize with chemotherapy and radiotherapy. Curcumin, the principal constituent of turmeric, has various therapeutic attributes. Curcumin (diferuloylmethane) is a natural polyphenol (1,7-bis(4-hydroxy-3-methoxyphenyl)-1,6-heptadiene-3,5-dione) with three pharmacophore moieties in its structure: two phenolic moieties and an α,β-unsaturated keto-enol system (a seven-carbon skeleton linking the two phenolic groups). These pharmacophoric features are significant for interaction with biological macromolecules. Additionally, the seven-carbon skeleton is flexible enough to adapt for maximum intermolecular interactions (Nelson et al. 2017; Gramatica 2020). The in-house database of ligands includes compounds from indigenous plants of India and Taiwan with proven anticancer, antiplatelet, or antituberculosis potential (Lin et al. 2013). Two significant Curcuma species, Curcuma longa and Curcuma zedoaria, contain a total of 61 bioactive phytocomponents, whose structures were retrieved from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/) and the TIPdb database (Kim et al. 2021). The structural features of Curcumin are shown in Fig. S1 (Supplementary Information). The ligands were converted to SDF file format for input into the PyRx software (http://PyRx.sourceforge.net/) (Dallakyan and Olson 2015).

Grid generation and docking

The in-house library of ligands is optimized with the MMFF94 and Ghemical force fields of the Open Babel software. The conjugate gradient and steepest descent optimization algorithms (50,000 steps with a convergence criterion of 0.001 kcal/mol/Å) are used for energy minimization. Finally, the 'all ligand' option was used to convert the minimized files to PDBQT format, producing their atomic coordinates for docking. The grid box for the target KLK-14 protein was centred at x = − 3.2933, y = − 1.3572, and z = 25.2476 (Zaveri et al. 2015). Discovery Studio 3.5 is used to visualize the interactions in the best-ranked binding pose (Sussman et al. 1998).

Binding free energy calculation

The best-ranked binding poses are submitted to PRODIGY-LIG, a server for predicting the binding free energy of protein–small ligand complexes (Kurkcuoglu et al. 2018). PRODIGY-LIG has the advantage of being simple, generic, and applicable to any kind of protein–ligand complex. It provides an automatic, fast, and user-friendly tool, ensuring broad accessibility. PRODIGY-LIG performs a refinement of the interfaces through HADDOCK (in-house docking software) (van Zundert and Bonvin 2014) to extract the intermolecular electrostatic energy and the type and number of intermolecular atomic contacts (ACs) within a 10.5 Å distance cutoff, classified according to the atoms involved in the interaction (C = carbon, O = oxygen, N = nitrogen, and X = all other atoms). A combination of structure- and energy-based terms was used to train multiple linear regression models. These binding affinity predictor models, ∆Gscore for ranking ligands and ∆Gprediction for predicting the affinity, are complemented by a model without the electrostatic term, ∆Gnoelect, which is given as follows:

$$\Delta G_{\text{noelect}} = 0.0354707\,AC_{\text{NN}} - 0.1277895\,AC_{\text{CC}} - 0.0072166\,AC_{\text{CN}} - 5.1923181$$
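The ∆Gnoelect expression is a plain linear combination of contact counts, so it can be evaluated directly once the atomic contacts have been counted. The sketch below only applies the coefficients given above; it does not count contacts from a structure, and the contact numbers in the example are hypothetical.

```python
def dg_noelect(ac_nn: float, ac_cc: float, ac_cn: float) -> float:
    """Evaluate the contact-based binding free energy model (kcal/mol).

    ac_nn, ac_cc and ac_cn are the numbers of N-N, C-C and C-N atomic
    contacts within the distance cutoff.
    """
    return (0.0354707 * ac_nn
            - 0.1277895 * ac_cc
            - 0.0072166 * ac_cn
            - 5.1923181)


# Hypothetical contact counts, used only to show the call.
print(round(dg_noelect(ac_nn=10, ac_cc=30, ac_cn=100), 2))
```

Because the C–C coefficient is the largest in magnitude, the predicted affinity in this model is dominated by the number of carbon–carbon contacts.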

To obtain quick yet robust predictions of physicochemical properties, pharmacokinetics, drug-likeness, and medicinal chemistry friendliness, the web-based SwissADME tool is used, which also includes in-house methods such as BOILED-Egg, iLOGP, and the Bioavailability Radar (Daina and Zoete 2016). Based on the physicochemical properties (SwissADME), binding score, RMSD, and visual inspection, the best pose of the docked ligands is identified as a new potent lead molecule for KLK-14 protein inhibition.

Results and discussion

The present study treats KLK-14 as a novel target protein for the identification of new leads as drug candidates for the inhibition of pathological angiogenesis. Various validation techniques were used to analyze the 3D model of the target protein. The active site residues were confirmed by protein–protein docking. Virtual screening was then performed to identify the best-docked phytochemicals.

Homology modelling of KLK-14

The FASTA sequence of the KLK-14 target protein (267 amino acids) was retrieved from the ExPASy server, and a template search was carried out on the NCBI-BLAST, HHpred, and JPred servers. The Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST) is used to identify a template, i.e., an amino acid sequence homologous to the KLK-14 protein sequence (Johnson et al. 2008). The BLAST program uses a stochastic model to identify the template with the closest identity to the target protein (Karlin and Altschul 1990). The low E-value of 5MS3-A indicates a significant level of biological correlation with the KLK-14 amino acid sequence.

The Jpred4 server identifies template proteins using solvent accessibility, homologous secondary structure prediction, and coiled-coil region prediction, based on multiple sequence alignment profiles. Using the JNet algorithm, Jpred4 predicts the secondary structural elements (α-helices, β-sheets, and loops) for the proteins deposited in the RCSB PDB (Drozdetskiy et al. 2015). The Jpred4 server identified 5MS3-A as a template for KLK-14. HHpred implements a pairwise comparison of profile hidden Markov models (HMMs) (Soding et al. 2005). Based on the HMM–HMM comparison, with an E-value of 9.87e-79, the predicted template structure is 5MS3-A.

The results obtained from the NCBI-BLASTp, JPred4, and HHpred servers are presented in Table 1. Percent identity describes the fraction of residues that are identical between the query and the reference sequence; a hit with a low percent identity can still be a true hit. The E-value is critical when looking for homology between conserved regions (Kerfeld 2011). The template shows a query coverage of 85% and a percent identity of 49.57%; on the basis of the lowest E-value, PDB ID: 5MS3-A (retrieved from the RCSB Protein Data Bank) is selected as the template to build a reliable model of the KLK-14 protein.

Table 1 Template selection for KLK-14 protein

A prerequisite for building a reliable 3D structure is a reliable alignment of KLK-14 with its phylogenetically related template sequence, 5MS3-A. Figure 2 depicts the pairwise alignment of the KLK-14 protein sequence with the template protein sequence using CLUSTAL W, which shows 42.69% similarity between the target and template sequences.
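How such a percentage is obtained from a pairwise alignment can be sketched as follows. The convention used here (identical residues divided by the number of columns in which neither sequence has a gap) is an assumption; alignment tools differ in the denominator they use, so the value need not match the figures reported by BLAST or CLUSTAL W. The aligned strings are toy examples, not the KLK-14 or 5MS3-A sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment: five ungapped columns, four of them identical.
print(percent_identity("ACD-EF", "ACDKEY"))
```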

Fig. 2 Pairwise alignment of KLK-14 with the template protein, carried out with CLUSTAL W and visualized in Discovery Studio 3.5. Conserved amino acid residues are shown in pink, strongly similar residues in green, weakly similar residues in yellow, and diverse residues in blue

Modeller 9v9 was used to generate 200 initial models, of which the model with the lowest Modeller objective function was taken forward for refinement. The 3D model was further refined by loop modeling and energy minimization using Swiss PDB Viewer 4.1.0 (Kaplan 2001). Table 2 shows the MD simulation refinement data for KLK-14 obtained from the locPREFMD web tool, together with the overall refinement factors after the simulations. The generally acceptable RMSD for an in silico model is within 2.5 Å (Hui-Hsu et al. 2004). After the MD simulations, the main-chain RMSD was < 1.033 Å and the Cα-RMSD was < 0.980 Å (Koichi et al. 2017). The 3D homology model of KLK-14 protein shows a secondary structure of 14 strands and 3 helices (Fig. 3).

Table 2 Molecular dynamics parameters of the stabilized structure of KLK-14

Fig. 3 Three-dimensional (3D) structure of KLK-14 protein

Structural validation

Figure 4 depicts the Ramachandran plot (RC plot) obtained from the PROCHECK server, which represents the stereochemical quality of the 3D model of KLK-14 protein. The RC plot of the refined 3D structure shows 90.80% of the amino acid residues in the energetically favored region (red), 10.80% in the additionally allowed region (yellow), and no residues in the generously allowed (light yellow) or disallowed regions, indicating that the predicted 3D model of KLK-14 protein is of good stereochemical quality.

Fig. 4 Ramachandran plot of the KLK-14 protein

The ProSA server assesses the overall and local quality of the 3D model of KLK-14 by comparing it with proteins of similar amino acid length deposited in the RCSB PDB. Figure 5A depicts the ProSA plot: the overall folding energy of KLK-14 is negative (z score = − 6.18; seen as a dark spot in the light blue region), indicating that the 3D model is very close to experimentally determined X-ray structures. Figure 5B illustrates the ProSA energy profile of the local model quality using knowledge-based energy values. The ProSA II plot gives the folding energy of the protein as a function of amino acid sequence position. The low energy profile indicates that the local quality of the KLK-14 model is reliable.

Fig. 5 ProSA server results for KLK-14 protein. A ProSA plot for the KLK-14 protein (the protein is seen as a black dot), z score = − 6.18. B Knowledge-based energy profile of amino acids with a window size of 10 amino acids (light green) and a window size of 40 amino acids (dark green)

The secondary structure of the 3D model of KLK-14 protein, as obtained from the PDBsum server, indicates the presence of 3 helices, 13 beta-sheets, and 7 beta hairpins (Fig. 6). Further analysis reveals that the three helices span the following amino acid ranges: (a) α-helix H1 from Ala82 to Cys84, (b) helix H2 from Asp188 to Ala194, and (c) helix H3 from Leu257 to Lys263.

Fig. 6 Secondary structure elements of KLK-14 protein, as obtained from PDBsum

The KLK-14 protein secondary structure shows 3 α-helices, 13 β-sheets, and 6 disulphide bonds. The thirteen (13) beta-sheets span the amino acids Gln55–Leu59, Ala71–Ser74, Trp77–Thr80, Gln90–Leu93, Val107–Thr115, Met129–Leu133, Glu147–Val148, Ser158–Gly163, Gln179–Ile185, Met203–Gly207, Pro203–Cys226, Gln229–Trp236, and Gly248–Asn252. The beta-hairpin loops span Ser74–Trp77, Leu93–Val106, Thr115–Met129, Gly163–Gln179, Ile185–Met203, Cys226–Gln229, and Trp236–Gly248. The 3D model also contains 6 disulphide bonds, which play a highly significant role in protein folding and stability. They are formed between Cys47–Cys180, Cys68–Cys84, Cys152–Cys254, Cys159–Cys226, Cys191–Cys205, and Cys216–Cys241, and are believed to enhance the stability of the 3D model of KLK-14 protein. Other members of the kallikrein family, such as KLK 5 and KLK 7, are also reported to have a similar secondary structure.

Active site identification and docking

The NCBI-BLASTp results, shown as a pictorial representation in Fig. S2 (Supplementary Information), indicate that the target KLK-14 protein has a trypsin-like serine protease (Tryp_SPc) domain from amino acid 41 to 267. The study of the two interfaces of the KLK-14–fibronectin protein–protein complex provides specific details of the binding site region in KLK-14, which is responsible for activating the MMPs for angiogenesis.

The binding pockets of KLK-14 involved in the interaction with its natural signal transduction receptor were identified using the CASTp server and the Active Site Prediction server. The CASTp results show that the region from Gln35 to Lys267 holds a large hydrophilic pocket of KLK-14, along with a small pocket of three amino acids (Gln150, Arg227, and Gln229). Both servers identified two binding pockets in the same conserved area; their volumes and amino acid residues are given in Table 3.

Table 3 Amino acid residues in the binding pockets of KLK-14 protein, as obtained from the CASTp and Active Site Prediction servers, with their volumes

As per the literature, the active site of the KLK-14 protein contains the catalytic triad His83, Asp127, and Ser220, and substrate residues such as arginine/lysine and tyrosine/tryptophan/phenylalanine account for its trypsin-like and chymotrypsin-like substrate specificity, respectively. To identify the specific amino acid residues involved in binding, protein–protein interaction (PPI) studies were conducted between KLK-14 and fibronectin using the PatchDock server.

The PatchDock server ranks the docked complexes using scoring functions based on shape complementarity and the atomic desolvation energy of the transformed complex. Table 4 lists the interactions in the docked complex of KLK-14 and fibronectin: one (1) pi-cation and eleven (11) hydrogen bond interactions.

Table 4 Docking interactions of Fibronectin and KLK-14 protein

As described in the literature, the catalytic triad residues His83, Asp127, and Ser220 are conserved in KLK-14. The protein–protein interaction (PPI) results show that His83 and Ser220 of the catalytic triad are involved in binding interactions with fibronectin. Taken together, the results of the PPI study and the active site prediction servers indicate that the region from Arg65 to Glu239 is the active site, and that the residues significant for binding, and thus for angiogenic signaling, are His83, Arg86, Ser121, Tyr174, Gln217, Gly218, Ser220, and Glu239.

Virtual screening of KLK-14 protein

Virtual screening is carried out by docking in a ligand-centered map generated by the AutoGrid program with a spacing of 0.375 Å and grid dimensions of 40 × 48 × 40 Å³. The grid box center was set to the coordinates − 3.2933, − 1.3572, and 25.2476 in x, y, and z, respectively, as shown in Fig. S3 (Supplementary Information). The generated docking poses of the ligands are ranked by the Vina empirical scoring function, which approximates the binding affinity in kcal/mol.
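The search box described here can be written out explicitly. The sketch below takes the box centre from the text and reads the dimensions as edge lengths in Å, which is an assumption; if the dimensions are instead grid-point counts at 0.375 Å spacing, the edges would be shorter. The configuration keys follow the AutoDock Vina configuration file format (`center_x`, `size_x`, and so on), and the helper names are hypothetical.

```python
def box_corners(center, size):
    """Return the minimum and maximum corners of an axis-aligned search box."""
    lower = tuple(c - s / 2.0 for c, s in zip(center, size))
    upper = tuple(c + s / 2.0 for c, s in zip(center, size))
    return lower, upper


def vina_box_config(center, size):
    """Format the search box as lines of an AutoDock Vina configuration file."""
    lines = []
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value}")
    return "\n".join(lines)


center = (-3.2933, -1.3572, 25.2476)  # from the text
size = (40.0, 48.0, 40.0)             # assumed to be edge lengths in angstrom

print(box_corners(center, size))
print(vina_box_config(center, size))
```

Printing the corners is a quick way to confirm that the catalytic triad residues of the model actually fall inside the box before a screening run is started.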

Table 5 lists the top ten phytochemicals that bind to KLK-14 protein, as identified by virtual screening. Analysis of the docking results indicates that KLK-14 shows good affinity for Curcumin ligands, with acceptable binding energies. The binding interactions, such as hydrogen bonds and associated interactions, were visualized using Accelrys Discovery Studio 3.5 and are shown in Fig. S4 (Supplementary Information). Hydrogen bond interactions with the α,β-unsaturated carbonyl group and the phenolic group of Curcumin derivatives were identified. Extensive π interactions (π–π, π–sigma, and π–cation) with aromatic rings and sp3-hybridized asymmetric carbons were also found.

Table 5 Binding energy, Binding Free Energy and Interactions of Phyto compounds with KLK-14

Ligands L1 to L10 are α,β-unsaturated carbonyl scaffolds, except for L1 and L5. Ligand L1 has a dihydronaphthofuran moiety, and ligand L5 is an oxatricyclo derivative with an alcoholic pharmacophore moiety. Ligands L2, L3, L4, and L6 have carbonyl, phenolic, and aromatic ring groups as pharmacophores. The IUPAC names of the ligand molecules and their binding energies are given in Table S1 (Supplementary Information). Table 5 presents the binding interactions of Curcumin and its derivatives with KLK-14; these scaffolds may be considered for further development of new leads against pathological angiogenesis and cancer.

The ligand molecules exhibit hydrogen bond, π–π, π–cation, and π–sigma interactions with the amino acids Arg65, Phe66, Leu67, His83, Tyr174, and Ser220 of the conserved domain of KLK-14. All these non-covalent interactions in the docked complexes are significant, since they involve the catalytic triad residues and confirm the dual specificity. In addition to the hydrogen bond interactions, π interactions give additional stability to the KLK-14 protein–Curcumin ligand complex. Similar protocols have been used by other groups to discover novel leads (Chang et al. 2010; Cerqueira et al. 2015; Vadija et al. 2016; Vellanki et al. 2018).

Binding free energy calculations were performed for the top-ranked compounds, which were re-evaluated using the PRODIGY-LIG server. On the basis of binding free energy, the re-evaluation reveals that Curcumin and Bisdemethoxycurcumin show the highest affinity toward KLK-14.

ADMET analysis

ADMET assessment is a key stage of the drug development process prior to pre-clinical evaluation. ADME is evaluated, on the basis of physicochemical properties, for the L1–L10 Curcumin derivatives that show affinity to KLK-14. Table 6 shows that all Curcumin ligand molecules have pharmacokinetic values within the acceptable range.

Table 6 Predicted ADME of the docked molecules

Fig. S5 (Supplementary Information) shows the Bioavailability Radar plots for L1–L10, which allow a rapid appraisal of drug-likeness. The Bioavailability Radar takes into account six physicochemical properties: lipophilicity, size, polarity, solubility, flexibility, and saturation. A physicochemical range on each axis is defined by adapted descriptors and depicted as a pink area, within which the radar plot of a molecule has to fall entirely for it to be considered drug-like (Lovering et al. 2009; Ritchie et al. 2011). The pink area represents the optimal range for each property (lipophilicity: XLOGP3 between − 0.7 and + 5.0; size: MW between 150 and 500 g/mol; polarity: TPSA between 20 and 130 Å²; solubility: log S not higher than 6; saturation: fraction of carbons in sp3 hybridization not less than 0.25; and flexibility: no more than 9 rotatable bonds). L1, L3, L4, L5, L7, L8, L9, and L10 are predicted not to be orally bioavailable because they are too flexible and too polar.
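The radar criteria listed above amount to a set of range checks, which the following sketch makes explicit. Two points are assumptions rather than statements from the text: the solubility criterion is interpreted as log S ≥ −6, and the upper bound of 1.0 for the sp3 fraction simply reflects that it is a fraction. The property values in the example are hypothetical and do not correspond to any of the ligands L1–L10.

```python
INF = float("inf")

# (lower bound, upper bound) for each radar axis.
RADAR_RANGES = {
    "xlogp3": (-0.7, 5.0),        # lipophilicity
    "mw": (150.0, 500.0),         # size, g/mol
    "tpsa": (20.0, 130.0),        # polarity, square angstrom
    "log_s": (-6.0, INF),         # solubility (assumed reading: log S >= -6)
    "fsp3": (0.25, 1.0),          # saturation
    "rotatable_bonds": (0, 9),    # flexibility
}


def radar_violations(properties):
    """Return the names of the radar axes on which a molecule is out of range."""
    violations = []
    for name, (low, high) in RADAR_RANGES.items():
        value = properties[name]
        if not (low <= value <= high):
            violations.append(name)
    return violations


# Hypothetical molecule that is too polar and too flexible.
example = {"xlogp3": 2.0, "mw": 300.0, "tpsa": 150.0,
           "log_s": -3.0, "fsp3": 0.4, "rotatable_bonds": 12}
print(radar_violations(example))
```

An empty list means that the molecule falls entirely within the pink area on all six axes.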

Figure 7 shows the BOILED-Egg predictive model for gastrointestinal absorption and BBB permeation. The model plots TPSA against log P: the white region is the physicochemical space of molecules with the highest probability of being absorbed by the gastrointestinal tract, and the yellow region (yolk) is the physicochemical space of molecules with the highest probability of permeating to the brain. Some of the phytocomponents fall outside the bioavailability zone. L1, L3, L4, L5, L6, L7, and L10 are predicted to cross the BBB and are therefore not suitable as non-CNS drugs (Tian et al. 2015), whereas L2 and L8 are hydrophilic and suitable as oral drugs. Substrates of P-glycoprotein (P-gp) are susceptible to changes in pharmacokinetics due to drug interactions with P-gp inhibitors or inducers. P-gp overexpression is one of the main mechanisms behind decreased intracellular drug accumulation in various cancers (Breier et al. 2013).

Fig. 7 BOILED-Egg predictive model

ADMET assessment before pre-clinical trials is a critical stage of the drug development process, and ADME is evaluated here from the physicochemical properties of the L1–L10 ligands that show affinity to KLK-14. Lipophilicity (log P) is a measure of efficient drug transport; all ligands in this study show high lipophilicity (0.09–5.5) and the ability to pass passively through the transcellular route. Inadequate levels of a drug passing through the blood–brain barrier (BBB) can lead to central nervous system (CNS) adverse effects (Betsholtz 2014), and the polar surface area is a critical parameter for BBB permeation. The polar surface area values of all ligands are within the acceptable range, i.e., ~ 120 Å², and the ligands have strong overall ADME properties.

The results show that the L2 (Curcumin) and L8 (Zedoarolide A) ligands identified by virtual screening have drug-like properties and synthetic viability comparable to existing drugs. They can therefore be considered for the design of new KLK-14 protein inhibitors, and their scaffolds may be considered for further development of new leads against pathological angiogenesis and cancer.

Conclusions

The 3D structure of KLK-14 protein generated using 5MS3 as a template is a good-quality model, comparable to X-ray resolved protein structures. Protein–protein docking of KLK-14 with fibronectin gave an insight into the binding residues in the active site region of the KLK-14 protein. The amino acid residues Arg65, Phe66, Leu67, His83, Tyr174, and Ser220 of KLK-14 protein have putative binding interactions with the ligand molecules. Virtual screening studies and binding free energy calculations identified the Curcumin and Bisdemethoxycurcumin scaffolds (∆Gnoelect, binding free energy = − 9.4 and − 9.3 kcal/mol) with acceptable ADME properties. These scaffolds are predicted to inhibit the KLK-14 protein by blocking the catalytic domain (triad residues Ser220 and His83) and can act as lead scaffolds for the design of inhibitors of pathological angiogenesis in cancer.