Introduction

Gastric cancer (GC) ranks as the fourth leading cause of cancer death and the fifth most common cancer worldwide [1, 2]. GC is a serious threat to public health globally and is one of the most common gastrointestinal malignancies in East Asia [3,4,5]. Because early detection markers are lacking, GC is usually diagnosed at an advanced stage, when effective treatment options are limited and the prognosis is poor. Surgery is considered the most effective treatment for early-stage GC; chemotherapy and radiotherapy are also used, but the 5-year survival rate remains below 25% [6,7,8].

Tumor marker analysis is an important tool for cancer prevention. Because proteins play important roles at both the cellular and molecular level, proteomics has been applied to various types of cancer, including gastric cancer, to search for new cancer markers and drug targets [9]. Many studies to date have attempted to identify effective markers for GC, such as fibulin-5 [10], nicotinamide N-methyltransferase (NNMT) [10, 11], ANXA1 [10], UQCRC1 [10], Her-2 [10, 12, 13], EGFR [14, 15], carcinoembryonic antigen (CEA) [16], alpha-fetoprotein (AFP) [16], carbohydrate antigen (CA) [16], VEGF [17], c-SRC [18], HGF/MET [19, 20], cancer antigen 19-9 [21, 22], erb-b2 receptor tyrosine kinase 2 [21], and E-cadherin [6, 23].

Nicotinamide N-methyltransferase (NNMT) is the enzyme that catalyzes the N-methylation of nicotinamide, and high NNMT expression has been reported in GC tissues. This abnormal expression suggests that NNMT is a potential prognostic biomarker and molecular therapeutic target in early and advanced GC [24, 25]. c-Src is a member of the Src family of non-receptor protein tyrosine kinases (SFKs), which have key roles in intracellular signaling. It is upregulated in gastric cancer, where it promotes cancer cell proliferation and metastasis, and Src is therefore considered a promising therapeutic target for the treatment of gastric cancer [26, 27]. Cadherin-1, or E-cadherin (CDH1), belongs to the cadherin family of membrane proteins, which play a role in maintaining cell membrane integrity. Loss or downregulation of E-cadherin has been observed in GC, and targeting this alteration may be an effective therapeutic approach for GC treatment [28, 29].

Identifying potential targets for attenuating metastasis and developing potent therapeutic drugs is essential for the effective treatment of GC [30]. A main goal of anti-cancer research is to control the cell cycle, either by activating a cell cycle blocker or by inducing apoptosis [31]. Induction of apoptosis in target cells is a key mechanism that should be considered when testing anti-cancer drug activity. An important strategy in chemoprevention, and in the use of natural compounds, is the activation of apoptotic pathways by inhibiting anti-apoptotic BCL-2 family proteins or by activating TRAIL death receptors [32].

Despite advances in cancer therapy, the worrying side effects of the drugs currently on the market have encouraged researchers to study traditional medicinal plants. Only about 1% of the 500,000 plants known to date (about five thousand) have been studied, which underlines the need to discover new bioactive drug compounds [31]. Natural and biologically active products are widely used in clinical and basic research aimed at developing effective, low-toxicity drugs that inhibit tumor recurrence and metastasis. Plant-derived anti-cancer drugs currently in clinical use include vinblastine, vincristine, paclitaxel, and camptothecin. Given the diversity of medicinal plants, much research has been devoted to screening natural compounds against molecular targets for cancer prevention, which has led to the discovery of anti-cancer agents [32, 33].

Many natural products have been reported to strongly induce apoptosis in gastric cancer cell lines, including saffron [34], curcumin [35,36,37], quercetin [38, 39], carvacrol [39], berberine [40, 41], gallic acid [42], resveratrol [43, 44], salidroside [45], oleanolic acid [46, 47], anthocyanins [48], stilbenes [32], 6-gingerol [49], ellagic acid [50], and β-sitosterol [50]. These compounds have biological and pharmacological properties including anti-inflammatory, antioxidant, antibacterial, anti-cancer, and anti-proliferative activity, and they affect many cancer cell lines, including gastric cancer, by inducing apoptosis and suppressing the proliferation and invasion of cancer cells.

The current project aims to discover new natural compounds for the treatment of gastric cancer. To this end, a ligand-based pharmacophore hypothesis was generated and a 3D-QSAR model was built to find common features that can be used to predict the biological properties of ligands and to relate ligand structures to their activities through predicted pIC50 values. The binding of the active compounds to amino acid residues in protein receptors that play an important role in the mechanism of gastric cancer was then investigated by molecular docking. In this context, a natural compounds database was created, and pharmacophore generation, 3D-QSAR modeling, virtual screening, molecular docking, and molecular dynamics were performed. Ultimately, twenty-one lead compounds were selected from this database.

Methods

Protein preparation

In this study, three protein biomarkers reported in previously published articles to play a key role in gastric cancer and in drug mechanisms of action were selected as receptors, and their crystal structures were downloaded from the RCSB Protein Data Bank (PDB) (https://www.rcsb.org/pdb). In addition, a list of genes involved in gastric cancer was compiled from previously published articles; the protein-coding genes among them were identified using the GeneCards database (https://www.genecards.org/), and the corresponding proteins were checked on the UniProt site (https://www.uniprot.org/), which confirmed the same protein biomarkers as being involved in gastric cancer. The selected structures are PDB IDs 7BKG, 4F5B, and 4ZT1. These structures were prepared using the Protein Preparation Wizard (Maestro version 12.5, 2020). Preparation comprised adding hydrogen atoms, creating disulfide bonds, deleting water molecules beyond 3.00 Å from HET groups, generating HET states using Epik (pH 7 ± 2), filling in missing side chains and loops using Prime, and optimization and minimization using the OPLS3e force field.

Ligand preparation

To build the natural compounds database, compounds were downloaded from the AnalytiCon Discovery database (https://www.ac-discovery.com), the IBScreenNP database (https://www.ibscreen.com/naturalcompounds), the SpecNatural database (https://www.specs.net), and the Zinc15 database (http://www.zinc.docking.org/browse/catalogs/naturalproducts). The LigPrep module of Maestro version 12.5 (Schrödinger, LLC, New York) was used to convert the 2D structures to 3D, with the OPLS3 force field, ionization states generated using Epik (pH 7 ± 2), and at most 4 isomers generated per ligand. The resulting database contained 183,885 structures.

Developing a pharmacophore model

The PHASE module of Maestro version 12.5 (Schrödinger, LLC, New York) was used for pharmacophore modeling. A common pharmacophore hypothesis is a spatial arrangement of several pharmacophoric features that together represent the major binding interactions between the active ligands and the receptor [51]. Pharmacophore features include a hydrogen bond acceptor (A), a hydrogen bond donor (D), a hydrophobic group (H), a negatively charged group (N), a positively charged group (P), and an aromatic ring (R). In the present study, a set of 50 diverse structures with anti-gastric cancer effects and reported IC50 (half-maximal inhibitory concentration) values was collected from previously published reports; 7 of these 50 ligands are FDA-approved chemotherapeutic drugs used to treat gastric cancer. IC50 values were converted to pIC50 using the following formula.

$$\mathrm{pIC}_{50} = -\log_{10}\left(\mathrm{IC}_{50}\right)$$

In this study, IC50 is the half-maximal inhibitory concentration, reported in nanomolar or micromolar units. The chemical structures of all these ligands and their IC50 values are shown in Table S1 in the supplementary data. In the Pharm Set column, the ligands were divided into active and inactive groups using thresholds of pIC50 ≥ 6.5 for active and < 5.5 for inactive. Only the conformations of the active ligands were used to generate the pharmacophore hypotheses.
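The conversion and the active/inactive split described above can be illustrated with a short Python sketch. It assumes the common convention that IC50 is expressed in mol/L before the logarithm is taken, so nanomolar and micromolar values are rescaled first; the function names, the unit table, and the "intermediate" label for ligands between the two thresholds are illustrative choices, not part of the PHASE workflow.

```python
import math

# Conversion factors to mol/L (assumed convention: pIC50 uses molar IC50).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def pic50_from_ic50(ic50_value, unit="nM"):
    """Return pIC50 = -log10(IC50 in mol/L)."""
    if ic50_value <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_value * UNIT_TO_MOLAR[unit])


def pharm_set_label(pic50, active_min=6.5, inactive_max=5.5):
    """Label a ligand using the thresholds quoted in the text.

    Ligands between the two thresholds are labelled 'intermediate'
    (a hypothetical name used only in this sketch).
    """
    if pic50 >= active_min:
        return "active"
    if pic50 < inactive_max:
        return "inactive"
    return "intermediate"


if __name__ == "__main__":
    for value, unit in [(100.0, "nM"), (10.0, "uM")]:
        p = pic50_from_ic50(value, unit)
        print(f"IC50 = {value} {unit}: pIC50 = {p:.2f} ({pharm_set_label(p)})")
```

Under this convention a smaller IC50 gives a larger pIC50, so the pIC50 ≥ 6.5 threshold corresponds to an IC50 of roughly 316 nM or lower.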

Pharmacophore model validation

The validity and significance of the pharmacophore models were assessed with statistical parameters. The quality evaluation of the pharmacophore models and the enrichment calculations were performed using the PHASE application. The validation set consisted of two groups: a decoy set (1,000 drug-like compounds with a molecular weight of 400 Da; http://www.schrodinger.com/glide_decoy_set) and an active set (70 known compounds with anti-cancer effects). The parameters studied were the enrichment factor (EF), robust initial enhancement (RIE), Boltzmann-enhanced discrimination of receiver operating characteristic (BEDROC), and the goodness of hit (GH).
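As an illustration of the first of these metrics, the following Python sketch computes an enrichment factor from a ranked validation set using the generic textbook definition: the fraction of actives among the top-ranked compounds divided by the fraction of actives in the whole set. The function name and arguments are hypothetical, and PHASE's own implementation may differ in details such as tie handling and rounding of the cutoff.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor at a given fraction of the ranked validation set.

    scores    -- screening scores, higher meaning better (assumption)
    is_active -- booleans marking known actives, aligned with scores
    fraction  -- top fraction of the ranked list to inspect (0.01 = EF1%)
    """
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and is_active must be non-empty and equal in length")
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("the validation set contains no actives")

    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    actives_in_top = sum(flag for _, flag in ranked[:n_top])

    hit_rate_top = actives_in_top / n_top
    hit_rate_all = total_actives / len(ranked)
    return hit_rate_top / hit_rate_all


if __name__ == "__main__":
    # Toy data: 100 compounds, 10 actives, 5 of them ranked in the top 10.
    toy_scores = list(range(100, 0, -1))
    toy_flags = [i < 5 or 50 <= i < 55 for i in range(100)]
    print(enrichment_factor(toy_scores, toy_flags, fraction=0.10))
```

An EF of 1 corresponds to random selection; larger values mean the hypothesis concentrates known actives near the top of the ranked list.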

Building a pharmacophore-based 3D QSAR model

The PHASE module of Maestro version 12.5 (Schrödinger) was used for building 3D-QSAR models. Quantitative structure-activity relationships (QSARs) are models that link molecular descriptors, which encode molecular structure information, to a target property of the molecules [52]. There are several methods for quantifying the relationship between structure and activity, one of the most important of which is partial least-squares (PLS) regression [52]. The main purpose of building a QSAR model is to predict the biological activity of new structures. A QSAR model can take two forms: atom-based or pharmacophore-based. In the atom-based model, all atoms in the entire structure of the molecules are considered, whereas in the pharmacophore-based model only the pharmacophoric features that can be matched to the hypothesis are considered. The first model is suitable for congeneric ligand series and the second for diverse ligand series that have more flexibility [53, 54]. To create the QSAR model, the set of 50 diverse structures with reported pIC50 values was randomly divided into a training set and a test set, with 80% of the structures assigned to the training set. The training set is used to create the QSAR model and the test set is used to validate it. The QSAR model must be validated both internally and externally [51]. External validation is performed using the predicted activities of the test set compounds. Internal validation of the pharmacophoric hypotheses is performed with statistical parameters including the correlation coefficient (R2), the cross-validation regression coefficient (q2), the standard deviation of regression (SD), the statistical significance (P), and the variance ratio (F) [52, 55].
The cross-validation regression coefficient was calculated from two quantities, the prediction error sum of squares (PRESS) and the sum of squares of the deviations of the experimental values from their mean (SSY), according to the following equation:

$$q^{2} = 1 - \frac{\mathrm{PRESS}}{\mathrm{SSY}} = 1 - \frac{\sum_{i=1}^{n}\left(Y_{\mathrm{exp}} - Y_{\mathrm{pred}}\right)^{2}}{\sum_{i=1}^{n}\left(Y_{\mathrm{exp}} - Y_{\mathrm{mean}}\right)^{2}}$$

where Yexp, Ypred and Ymean denote the experimental activity of a training set compound, the predicted activity of that compound, and the mean activity of the training set compounds, respectively [55]. The predictive performance of the model was also validated by the coefficient of determination for prediction (r2 test), according to the following equation:

$$r^{2}_{\mathrm{test}} = 1 - \frac{\sum_{i=1}^{n}\left(Y_{\mathrm{predtest}} - Y_{\mathrm{test}}\right)^{2}}{\sum_{i=1}^{n}\left(Y_{\mathrm{test}} - Y_{\mathrm{mean}}\right)^{2}}$$

Here Ypredtest, Ytest and Ymean denote the predicted activity of a test set compound, the experimental activity of that compound, and the mean activity of the test set compounds, respectively [55].
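Both validation statistics share the same form, one minus the ratio of the squared prediction errors to the squared deviations from the mean. The Python sketch below implements that shared form; it is a minimal illustration, the function name is hypothetical, and for q2 it assumes that the predicted values passed in are the cross-validated predictions for the training set compounds.

```python
def one_minus_press_over_ssy(y_observed, y_predicted):
    """Return 1 - PRESS/SSY for paired observed and predicted activities.

    With cross-validated training set predictions this corresponds to q2;
    with test set activities and predictions it corresponds to r2 test.
    """
    if len(y_observed) != len(y_predicted) or len(y_observed) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    y_mean = sum(y_observed) / len(y_observed)
    # PRESS: prediction error sum of squares.
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_observed, y_predicted))
    # SSY: squared deviations of the observed values from their mean.
    ssy = sum((obs - y_mean) ** 2 for obs in y_observed)
    if ssy == 0:
        raise ValueError("observed activities are all identical")
    return 1.0 - press / ssy


if __name__ == "__main__":
    # Illustrative pIC50 values only, not data from this study.
    observed = [5.0, 6.0, 7.0, 8.0]
    predicted = [5.5, 6.0, 7.0, 7.5]
    print(one_minus_press_over_ssy(observed, predicted))
```

A value of 1 means perfect prediction, and a value of 0 means the model predicts no better than the mean of the observed activities.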

Ligand-based virtual screening

The PHASE module of Maestro version 12.5 was used for ligand and database screening. Using the best-matching pharmacophores and an inter-site distance matching tolerance of 2.0 Å, this module creates a 3D database of the hit compounds with the best fitness scores, which were then docked against the three protein receptors. For hypotheses with 5 sites, a compound must match at least 4 sites. The fitness score represents how well a compound aligns with the selected hypothesis; it ranges from 0 to 3, with a score of 3 indicating the best alignment.

Molecular docking

The Glide application of Maestro version 12.5 (Schrödinger) was used to dock the ligands into the prepared protein receptors. In the receptor grid generation panel, the active binding site of each protein was defined with a grid dimension of 20 Å. For the two protein structures with PDB IDs 7BKG and 4F5B, the active site was defined from the native ligand; for the protein structure with PDB ID 4ZT1, the SiteMap application was used and the highest-scoring site was selected. Grid box dimensions are shown in Table 1. Ligand docking was performed first with the high throughput virtual screening (HTVS) method, and the first 100 compounds were then docked with the extra precision (XP) method using flexible docking, retaining the top 10% of compounds after docking.

Table 1 Grid box dimensions of the three receptors

ADME (absorption, distribution, metabolism, and excretion) and molecular properties

The QikProp application of Maestro version 12.5 (Schrödinger) was used to calculate the physicochemical properties and drug-likeness of all hit compounds, including Lipinski's rule of five, central nervous system (CNS) activity, human oral absorption, Caco-2 cell permeability (PCaco), the predicted brain/blood partition coefficient (logBB), and the polar surface area (PSA).

Molecular dynamics (MD) simulation

MD simulations were run for the top-scoring ligand-receptor complexes to investigate the ligand-receptor interactions and to confirm their stability. The complexes selected from the docking calculations were simulated using the GROMACS software. Ligand parameters were prepared using the SwissParam web server with the CHARMM force field. All systems were solvated in a triclinic box with TIP3P water molecules, as shown in Fig. S1 in the supplementary data. Energy minimization was performed using the steepest descent (SD) algorithm for 1 ns, followed by equilibration in the NVT and NPT ensembles at a temperature of 300 K and a pressure of 1 bar (Figs. S2 and S3), and a production run of 100 ns in total. Finally, the resulting trajectories were analyzed using VMD and Tecplot.

Results and discussion

The generation of pharmacophore and 3D-QSAR model

Many studies have investigated the anti-gastric cancer effect of natural compounds, either in vitro on gastric cancer cell lines or in silico. Examples include saffron [34], curcumin [36], quercetin [38], gallic acid [42], carvacrol [39], and anthocyanins [48]. In another study, the inhibitory effect of kaempferol on Jack bean urease, which has a prominent role in the development of gastric cancer, was examined using docking and molecular dynamics (MD) simulation [56]. To the best of our knowledge, few, if any, studies to date have investigated inhibitory effects on gastric cancer by combining virtual screening of a large number of compounds with pharmacophore modeling and 3D-QSAR.

The purpose of this study is to find new natural compounds with the most effective anti-gastric cancer properties. To this end, pharmacophore generation, 3D-QSAR, virtual screening, molecular docking, and molecular dynamics were used. First, the 183,885 database compounds were evaluated for ADME and physicochemical properties, and 141,173 compounds were carried forward to the ligand and database screening step to be matched against the best pharmacophore hypotheses. The 1000 compounds obtained after matching the hypotheses were then passed to the virtual screening workflow. Common pharmacophore hypotheses were generated from the set of 23 active ligands in the Pharm Set column, which carry the most important structural features required to interact with the protein receptors. A hypothesis was required to match at least 50% of the active ligands, with a minimum of 5 and a maximum of 6 sites. Eventually, ten five-feature pharmacophore hypotheses were developed, and the three hypotheses with the highest survival, site, vector, and volume scores were chosen (Fig. 1; Table 2).

Fig. 1

The three best five-feature pharmacophore hypotheses, with the distances between pharmacophoric features. A AARRR-1, B AARRR-2, C AARRR-3. Note: (A), hydrogen bond acceptor (pink sphere with arrows); (R), aromatic ring (yellow open circle). All distances are in Å

Table 2 Three best pharmacophore hypotheses with their Scores

A good pharmacophoric hypothesis can discriminate between active and inactive ligands. Here the best-developed hypothesis is AARRR, which combines two hydrogen bond acceptors (AA) and three aromatic rings (RRR). The distances between the features of the pharmacophoric hypotheses are shown in Table 3.

Table 3 Distance of pharmacophoric hypothesis features

Before virtual screening, the generated pharmacophore hypotheses should be validated using the enrichment factor. The results are shown in Table 4. According to these results, the pharmacophore hypothesis AARRR-2 has the highest EF1%, BEDROC, and ROC values, which indicates that its predictive ability is better than that of the other hypotheses. Figure 2 shows the alignment of the active and inactive ligands on pharmacophoric hypothesis AARRR-2.

Table 4 Validation of hypothesis features
Fig. 2

The alignment of a all active and b all inactive ligands and c most active and d most inactive ligand on the best pharmacophoric hypothesis AARRR-2

The QSAR models were created for the three top-ranked hypotheses using the atom-based partial least-squares (PLS) regression method. A 3D-QSAR model must be validated before it can be relied on. Internal validation of the three pharmacophore hypotheses was performed using statistical parameters based on PLS calculations; these parameters are shown in Table 5. All three hypotheses show a high R2 (squared correlation coefficient), a low SD (standard deviation), a high F (variance ratio), and a low RMSE (root-mean-square error), but hypothesis 2 (AARRR-2) has better predictive ability than the other two, with R2 = 0.94, F = 242, SD = 0.34, and RMSE = 0.98. These statistical parameters indicate the robustness of the developed 3D-QSAR model and pharmacophoric hypothesis. The scatter plot of the actual and predicted biological activity of the training and test sets (Fig. 3) shows the linear regression of the predicted pIC50 values against the experimental activity for the third PLS factor. The effectiveness of the model was assessed from the calculated correlation coefficient and Q2 for the randomly selected test set, which confirmed that the selected model has good predictive ability.

Table 5 Statistical parameters of the developed 3D-QSAR.
Fig. 3

Scatter plot of actual and predicted biological activity of the training and the test set

The validated hypothesis AARRR-2 obtained from the 3D-QSAR was used to generate the contour map. Contour maps can help understand the importance of functional groups at specific points in a biological activity pathway. These insights can be obtained by comparing the contour maps of ligands with the most and least activity. The results of the hydrogen-bond donor, negative ionic, and positive ionic contour map on the most and least active ligands are shown in Fig. 4. Blue and red cubes show favorable and unfavorable regions of hydrogen bond donor effect, respectively, while pink and green cubes indicate favorable and unfavorable regions of negative ionic effect, and purple and yellow cubes indicate favorable and unfavorable regions of positive ionic effect.

Fig. 4

Hydrogen-bond donor effect of a least active and b most active (blue, favorable; red, unfavorable), negative ionic effect of c least active and d most active (pink, favorable; green, unfavorable), positive ionic effect of e least active and f most active (purple, favorable; yellow, unfavorable)

Molecular docking studies

Molecular docking studies were performed using the virtual screening workflow in the Glide application of Maestro version 12.5 (Schrödinger) to investigate the intermolecular interactions between the ligands and the receptors. First, the HTVS (high throughput virtual screening) method was used for docking, which yielded 254 compounds, all with a molecular weight of less than 500 g/mol, matched ligand sites above 4, and a fitness score above 1.8. This high fitness score indicates that the ligands are well matched to the pharmacophoric hypotheses in the ligand and database screening step. These ligands were docked with the nicotinamide N-methyltransferase (PDB ID: 7BKG), cadherin-1 (PDB ID: 4ZT1), and proto-oncogene tyrosine-protein kinase Src (PDB ID: 4F5B) receptors. For further analysis, the first 100 compounds obtained from HTVS (with docking scores higher than -7 kcal/mol) were then investigated with the XP (extra precision) method. Finally, 21 compounds were obtained from Glide XP, with docking scores ranging from -13.366 to -6.404 kcal/mol. The docking score, fitness score, ΔG Bind, amino acid residues involved in the interaction, and pIC50 predicted by the QSAR model are listed for the lead compounds in Table 6, and their 2D chemical structures are shown in Fig. 5.

Table 6 Docking score (kcal/mol), fitness score, ΔG Bind, amino acid residues involved in the interaction, and predicted pIC50
Fig. 5

The 2D chemical structures of all the lead compounds are presented

The NA-1 compound, with the best (most negative) docking score (-13.366 kcal/mol), showed the strongest interaction with the nicotinamide N-methyltransferase (PDB ID: 7BKG) receptor among the compounds in Table 6. Analysis of the docking results for this ligand showed that the interactions between the ligand and the active site of the protein were hydrogen bonding and pi-pi stacking. The 2D and 3D interactions between the NA-1 compound and the nicotinamide N-methyltransferase (PDB ID: 7BKG) receptor are shown in Fig. 6. Important interactions include hydrogen bonding with amino acid residues ASN 90, LEU 164, and TYR 204 and pi-pi stacking with amino acid residues TYR 11 and TYR 204. In addition, the first 9 compounds in Table 6, which have docking scores of -13.366 to -10.207 kcal/mol and interact most strongly with the nicotinamide N-methyltransferase (PDB ID: 7BKG) receptor, interact with amino acid residues TYR 11, ASP 85, ASN 90, VAL 143, LEU 164 and TYR 204.

Fig. 6

The 2D (right) and 3D (left) receptor-ligand interaction of NA-1 compound with nicotinamide N-methyltransferase (PDB ID: 7BKG) active site. Important amino acid residues involved in the binding are shown in 2D and 3D interactions

Next, we used the native ligand (UOZ) of the nicotinamide N-methyltransferase (PDB ID: 7BKG) receptor as a control for molecular docking. This native ligand, with a docking score of -7.889 kcal/mol, showed hydrogen bonding with amino acid residue SER 213 and pi-pi stacking with amino acid residues TYR 204 and TYR 24. Comparing the interactions of the native ligand with those of our selected ligands indicates that they occupy the same active site and form similar interactions.

The NA-10 compound, with a docking score of -9.219 kcal/mol, showed the strongest interaction with the cadherin-1 (PDB ID: 4ZT1) receptor (Table 6). Analysis of the docking results showed that the important protein-ligand interactions include hydrogen bonding with amino acid residues LEU B:21, SER A:8, and SER B:8 and pi-pi stacking with amino acid residue TRP B:59. The 2D and 3D interactions between the NA-10 compound and the cadherin-1 (PDB ID: 4ZT1) receptor are shown in Fig. 7. In addition, for the 8 compounds that interact most strongly with the cadherin-1 (PDB ID: 4ZT1) receptor (docking scores of -9.219 to -7.695 kcal/mol), the interactions are hydrogen bonding and pi-pi stacking, and the amino acid residues involved include SER A:8, SER B:8, LEU A:21, LEU B:21, PRO A:6, PRO B:6, and TRP B:59.

Fig. 7

The 2D (right) and 3D (left) receptor-ligand interaction of NA-10 compound with cadherin-1 (PDB ID: 4ZT1) active site. Important amino acid residues involved in the binding are shown in 2D and 3D interactions

Finally, the NA-18 compound (docking score: -7.620 kcal/mol) showed the most interactions with the proto-oncogene tyrosine-protein kinase Src (PDB ID: 4F5B) receptor (Table 6). The most important of these interactions include hydrogen bonding with amino acid residues ARG 158, THR 182, ASN 201, HIE 204, SER 180, and GLU 181 and a pi-cation interaction with amino acid residue LYS 198. The 2D and 3D interactions between the NA-18 compound and the proto-oncogene tyrosine-protein kinase Src (PDB ID: 4F5B) receptor are shown in Fig. 8. Examination of the last 4 compounds in Table 6 (docking scores of -7.620 to -6.404 kcal/mol), which interacted most strongly with the proto-oncogene tyrosine-protein kinase Src (PDB ID: 4F5B) receptor, showed that the NA-19 to NA-21 compounds formed hydrogen bonding interactions. The amino acid residues involved in these protein-ligand interactions include ARG 158, ARG 178, SER 180, GLU 181, THR 182, ASN 201, HIE 204, and LEU 206.

Fig. 8

The 2D (right) and 3D (left) receptor-ligand interaction of NA-18 compound with proto-oncogene tyrosine-protein kinase Src (PDB ID: 4F5B) active site. Important amino acid residues involved in the binding are shown in 2D and 3D interactions

Molecular docking of the native ligand (PTR) with the proto-oncogene tyrosine-protein kinase Src (PDB ID: 4F5B) receptor (docking score of -8.269 kcal/mol) showed that the main interactions are hydrogen bonds with amino acid residues HIE 204, ARG 178, ARG 158, SER 180, and GLU 181. The active site and the interactions are therefore the same for our selected ligands as for the native ligand.

ADME/Tox studies

The QikProp application of the Schrödinger software was used to predict the pharmacokinetic properties and drug-likeness of the 21 lead compounds. Lipinski's rule of five (molecular weight ≤ 500 g/mol, hydrogen bond donors ≤ 5, hydrogen bond acceptors ≤ 10, octanol-water partition coefficient ≤ 5), central nervous system (CNS) activity, human oral absorption, PCaco, the brain/blood partition coefficient (logBB), and the polar surface area (PSA) were calculated for these 21 lead compounds. The results are reported in Table 7. As shown in Table 7, the molecular weight of all lead compounds is below 500 g/mol, they have up to 3 hydrogen bond donors and up to 8.50 hydrogen bond acceptors, and their estimated octanol-water partition coefficients range from 1.013 to 4.174. All compounds also have a polar surface area of less than 140 Å², except for the NA-21 compound (PSA = 143.487 Å²). Percent human oral absorption is an important factor in predicting pharmacokinetic properties. This study predicted 100% oral absorption for 8 compounds (NA-4, NA-5, NA-8, NA-9, NA-10, NA-13, NA-14, and NA-15), and oral absorption above 70% for most of the remaining compounds.
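The rule-of-five thresholds quoted above can be applied to tabulated descriptors with a few lines of Python. The sketch below is only an illustration of the filtering logic: the `Descriptors` class, its field names, and the example values are hypothetical, and the descriptors themselves would come from a tool such as QikProp rather than being computed here.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float   # g/mol
    h_donors: float     # hydrogen bond donors
    h_acceptors: float  # hydrogen bond acceptors
    log_p: float        # octanol-water partition coefficient
    psa: float          # polar surface area in square angstroms


def lipinski_violations(d):
    """Count violations of the four rule-of-five criteria from the text."""
    checks = [
        d.mol_weight <= 500,
        d.h_donors <= 5,
        d.h_acceptors <= 10,
        d.log_p <= 5,
    ]
    return sum(1 for passed in checks if not passed)


def psa_below_limit(d, limit=140.0):
    """True if the polar surface area is below the 140 square angstrom limit."""
    return d.psa < limit


if __name__ == "__main__":
    # Illustrative values only; not taken from Table 7.
    example = Descriptors(mol_weight=420.0, h_donors=3, h_acceptors=8.5,
                          log_p=4.174, psa=143.487)
    print(lipinski_violations(example), psa_below_limit(example))
```

Keeping the rule-of-five count separate from the PSA check mirrors the text, where a compound can satisfy all four Lipinski criteria and still exceed the PSA limit.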

Table 7 ADME properties of the lead compounds

Molecular dynamics studies

Molecular dynamics simulations were performed for the most active candidate on each target receptor (NA-1, NA-10, and NA-18). During the MD simulations, all systems were checked for structural stability and for the stability of ligand movement inside the active site (Fig. 9). The averages of the important parameters used to check system stability over the 100 ns simulations are summarized in Table 8. The H-bond analysis shows which receptor residues interact with the ligands and which play the most important role (Table 9). The results of the energy minimization and total production run steps for the three complexes (7BKG/NA-1, 4ZT1/NA-10, 4F5B/NA-18) are shown in Fig. 10, and the RMSD plots of the most important hydrogen bonds, used to assess stability and reliability after 100 ns of simulation, are shown in Fig. 11.

Fig. 9

Ligand movement from 0 ns to 100 ns. a 7BKG/NA-1, b 4ZT1/NA-10, c 4F5B/NA-18.

Table 8 Average parameters between proteins and ligands after simulation of 100 ns
Table 9 Hydrogen bonding between proteins and ligands after simulation of 100 ns
Fig. 10

Potential plots for a1, a2) 7BKG/ NA-1, b1, b2) 4ZT1/ NA-10, c1, c2) 4F5B/ NA-18, after minimization and total production run steps, respectively

Fig. 11

RMSD plots for most important hydrogen bonding of a 7BKG/ NA-1, b 4ZT1/ NA-10, c 4F5B/ NA-18, respectively, after simulation of 100 ns

Conclusions

This study aimed to identify new scaffolds with anti-gastric cancer properties against three protein receptors, 7BKG, 4F5B, and 4ZT1. For this purpose, pharmacophoric hypotheses and 3D-QSAR models were generated, and virtual screening based on these models was carried out to discover the new scaffolds. Fifty compounds with anti-gastric cancer properties were used to develop the 3D pharmacophore models. Virtual screening was performed with the three best pharmacophore models, and 3D-QSAR models were then used to obtain approximate predictions of the biological activity of the hit compounds. All compounds selected at this stage had a molecular weight of less than 500 g/mol, a fitness score above 1.8, and more than 4 sites matched with the pharmacophore models. These compounds were then docked with the three receptors 7BKG, 4F5B, and 4ZT1 using the HTVS and XP methods. Finally, 21 compounds with high docking scores were selected, and their ADME properties were calculated. Molecular dynamics simulations were performed for the top-scoring ligands (NA-1, NA-10, and NA-18) in complex with their receptors. The results of our study showed that the three pharmacophore models can capture the characteristics of gastric cancer inhibitors and, through the 3D-QSAR models, show the relationship between the structure and activity of these compounds.