1 Introduction

Epidermal growth factor receptor (EGFR), a transmembrane receptor tyrosine kinase belonging to the ErbB family, is the main target of clinical therapies for non-small cell lung cancer (NSCLC). To date, three generations of EGFR inhibitors have been developed. Gefitinib [1] and Erlotinib [2] were the first reversible tyrosine kinase inhibitors and were quite effective in treating NSCLC. However, over 50% of patients treated with first-generation EGFR inhibitors developed a resistance mutation (T790M) in less than a year. To overcome the resistance associated with the T790M mutation, second-generation EGFR inhibitors (Afatinib and Dacomitinib) were designed. However, these inhibitors have many adverse effects that limit their clinical use. Third-generation inhibitors such as WZ4002 [3], Rociletinib [4], and Osimertinib [5] were subsequently developed to reduce the toxicity of second-generation EGFR inhibitors (Fig. 1).

Fig. 1 The structures of the second-line EGFR L858R/T790M treatment drugs

In March 2017, Osimertinib, the first third-generation EGFR tyrosine kinase inhibitor, received full Food and Drug Administration (FDA) approval as a second-line treatment for NSCLC with EGFR T790M [6]. The side effects of Osimertinib in clinical use include nausea, diarrhea, hyperglycemia, and pneumonia [7, 8]. These severe side effects have prompted the search for new, potent inhibitors of EGFR L858R/T790M that overcome the side effects and drug resistance observed with Osimertinib. Several studies have examined the efficacy and molecular mechanisms of Osimertinib and the influence of EGFR-TKI pretreatment on the efficacy of Osimertinib in patients with the EGFR T790M mutation [9, 10]. In addition, recent research examined the impact of different acrylamides on the activity and selectivity of Osimertinib [11] and indicated that Osimertinib analogues were more active and selective for the EGFR T790M mutant than Osimertinib itself. These results encourage the design of new potent Osimertinib analogues targeting the H1975 cell line.

Compounds based on heterocyclic rings play a vital role in drug design and development [12,13,14]. Pyrimidine [15], thiazole [16], quinoline [17], and imidazole derivatives [18] exhibit diverse biological activities, with pyrimidine being particularly significant in the synthesis of antitumor drugs, including EGFR kinase inhibitors [19, 20]. Gaber et al. explored the pyrimidine scaffold in novel 1H-pyrazolo[3,4-d]pyrimidine analogues that exhibited remarkable inhibitory action against EGFR L858R/T790M kinase [21]. Similarly, Hao et al. studied a series of pyrimido[4,5-d]pyrimidine-2,4(1H,3H)-dione congeners, which displayed specific EGFR inhibitory action [22]. Jianheng et al. synthesized and evaluated novel 2,4-diaryl pyrimidines as selective EGFR L858R/T790M inhibitors; the most promising compound exhibited excellent kinase inhibitory action against the EGFR double mutation and inhibited the proliferation of cancer cells harboring the EGFR L858R/T790M mutation [23].

The present research uses computational approaches (3D-QSAR modeling, molecular docking, and molecular dynamics (MD) simulation) to investigate new potent EGFR L858R/T790M kinase inhibitors. A 3D-QSAR study was carried out on a series of 28 acrylamide derivatives to model their inhibitory effect on EGFR L858R/T790M kinase. The stability of four newly proposed compounds was then assessed through the ligand–receptor interactions identified by molecular docking and MD simulation. Finally, the pharmacokinetics and toxicity of the new compounds were evaluated by predicting their ADMET properties.

2 Materials and Methods

2.1 Data Set

Based on previous studies, a dataset of 28 N-(3-amino-4-methoxyphenyl)acrylamide derivatives with antiproliferative activity against the H1975 cell line was used in this study [24]. To build the 3D-QSAR models, the data were split into a training set of 23 compounds and a test set of 5 compounds. For the CoMFA and CoMSIA models, the activity values (IC50 in nM) were converted into the corresponding pIC50 values (−log IC50) and used as the dependent variable (Table 1).
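The conversion from IC50 to pIC50 can be sketched as follows. This is a minimal illustration that assumes the usual convention of expressing IC50 in mol/L before taking the logarithm; the function name and the example values are hypothetical and are not taken from Table 1.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> mol/L (assumed convention)
    return -math.log10(ic50_molar)


# Hypothetical IC50 values in nM, for illustration only
for ic50 in (1.0, 10.0, 250.0):
    print(f"IC50 = {ic50:6.1f} nM -> pIC50 = {pic50_from_nm(ic50):.2f}")
```

With this convention, a tenfold drop in IC50 raises pIC50 by one unit, so more potent compounds have larger values of the dependent variable.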

Table 1 Cytotoxicity of target compounds against H1975 cell line

2.2 Minimization and Alignment

Structural alignment is a crucial step and a critical parameter in 3D-QSAR modeling [25,26,27]. The twenty-eight molecules were constructed with the sketch module and minimized using Gasteiger–Hückel atomic partial charges under the standard Tripos force field [28], with the Powell gradient algorithm [29] and a convergence criterion of 0.001 kcal/mol, in the SYBYL-X 2.0 program [28, 30]. The aligned data set is displayed in Fig. 2.

Fig. 2 Aligned compounds of the data set using compound 17i as a template

2.3 3D-QSAR Modeling

After the alignment step, several CoMFA and CoMSIA models were built in SYBYL-X 2.1 to find a reliable model. In the CoMFA technique, an sp3-hybridized carbon probe atom with a van der Waals radius of 1.52 Å was used to compute the steric and electrostatic fields, with the default energy cutoff of 30 kcal/mol. The CoMSIA approach used the same grid as CoMFA and calculated five fields: steric, electrostatic, hydrophobic, H-bond donor, and H-bond acceptor.

2.4 Y‑Randomization Test

To evaluate the robustness of the 3D-QSAR models, a Y-randomization test was applied. The pIC50 values were randomly shuffled, and a new 3D-QSAR model was built after each shuffle. If the randomized models give low Q² and R² values, the original model does not result from a chance correlation and can be considered reliable for predicting the activity of new inhibitors. Otherwise, the original model is rejected because it is likely to overfit the training set.
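The logic of the test can be sketched in a few lines. This is only an illustration under stated assumptions: ordinary least squares on synthetic descriptors stands in for the PLS regression on CoMFA fields used in SYBYL, and all data, function names, and the number of trials are hypothetical.

```python
import numpy as np


def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with intercept (stand-in for PLS on CoMFA fields)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef


def r2_and_q2(X, y):
    """Return (R2 of the fit, leave-one-out cross-validated Q2)."""
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum((y - fit_predict(X, y, X)) ** 2) / ss_tot
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i  # leave compound i out
        pred = fit_predict(X[keep], y[keep], X[i:i + 1])[0]
        press += (y[i] - pred) ** 2
    return r2, 1.0 - press / ss_tot


def y_randomization(X, y, n_trials=20, seed=0):
    """Rebuild the model on shuffled activities and collect (R2, Q2) pairs."""
    rng = np.random.default_rng(seed)
    return [r2_and_q2(X, rng.permutation(y)) for _ in range(n_trials)]


# Synthetic training set: 23 "compounds" described by 3 made-up descriptors
rng = np.random.default_rng(42)
X = rng.normal(size=(23, 3))
y = X @ np.array([1.0, -0.5, 0.8]) + 7.0 + rng.normal(scale=0.1, size=23)

r2_true, q2_true = r2_and_q2(X, y)
shuffled = y_randomization(X, y)
mean_r2 = np.mean([r2 for r2, _ in shuffled])
mean_q2 = np.mean([q2 for _, q2 in shuffled])
print(f"original model : R2 = {r2_true:.3f}, Q2 = {q2_true:.3f}")
print(f"shuffled models: mean R2 = {mean_r2:.3f}, mean Q2 = {mean_q2:.3f}")
```

A clear gap between the original statistics and those of the shuffled models is the outcome that supports the original model.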

2.5 Molecular Docking

Molecular docking is widely employed in drug discovery and in the study of molecular interactions. It makes it possible to predict the optimal orientation and position of the ligand in the active site of the protein [31, 32]. In the present investigation, Surflex-Dock implemented in SYBYL-X 2.0 was used to study the modes of interaction between the ligands and the protein's active site. In practice, the crystal structure of the protein (PDB ID: 3W2O, resolution 2.35 Å) was downloaded from the RCSB Protein Data Bank [33]. Then, all water molecules in 3W2O were removed and polar hydrogen atoms were added. Finally, the results were visualized with Discovery Studio 2017 [34] and PyMOL [35]. A re-docking protocol was used to validate the docking technique: the native co-crystallized ligand and its docked pose were superimposed and the root mean square deviation (RMSD) was calculated, which must be less than 2 Å [36].
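The re-docking criterion can be illustrated with a short RMSD calculation. This is a minimal sketch that assumes both poses list the same heavy atoms in the same order and already share the receptor frame, so no extra fitting is applied; the coordinates are hypothetical and the function name is illustrative.

```python
import numpy as np


def pose_rmsd(ref_xyz, docked_xyz):
    """RMSD between two poses with identical atom ordering (no superposition)."""
    ref = np.asarray(ref_xyz, dtype=float)
    docked = np.asarray(docked_xyz, dtype=float)
    if ref.shape != docked.shape:
        raise ValueError("Both poses must contain the same atoms in the same order")
    squared_dev = np.sum((ref - docked) ** 2, axis=1)  # per-atom squared distance
    return float(np.sqrt(squared_dev.mean()))


# Hypothetical coordinates in angstroms (three atoms), for illustration only
crystal_pose = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]]
docked_pose = [[0.5, 0.0, 0.0], [1.5, 0.5, 0.0], [2.0, 1.5, 0.5]]

rmsd = pose_rmsd(crystal_pose, docked_pose)
print(f"RMSD = {rmsd:.3f} A, accepted: {rmsd < 2.0}")
```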

2.6 Molecular Dynamics (MD) Simulations

Molecular dynamics simulations were carried out with the GROMACS package (GROMACS 2020.4) [37]. Using the CHARMM36 force field [38], the complexes of the proposed molecules (T1, T2, T3, and T4) were simulated in water for 100 ns. Each system was solvated with TIP3P water molecules inside a truncated octahedral box, and potassium and chloride ions were added to neutralize it. To eliminate steric clashes, energy minimization was performed with the steepest descent method for up to 5000 steps, with a convergence criterion of a maximum force of 1000 kJ mol−1 nm−1. To ensure a well-converged system for the production run, all systems were equilibrated in the NVT and NPT ensembles for 100 ps (50,000 steps) and 1000 ps (1,000,000 steps), using time steps of 2 fs and 1 fs, respectively. The temperature was kept at 300 K with the velocity-rescaling thermostat and the pressure at 1 bar with the Parrinello–Rahman barostat (NPT). The Verlet cutoff scheme was used for non-bonded interactions, and long-range electrostatic interactions were treated with the Particle Mesh Ewald (PME) method. The production run of each complex lasted 100 ns.
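The time step implied by each equilibration phase is its duration divided by its number of steps. The check below uses only the durations and step counts quoted above; the function name is illustrative.

```python
def time_step_fs(duration_ps: float, n_steps: int) -> float:
    """Integration time step in femtoseconds for a run of given length and step count."""
    if n_steps <= 0:
        raise ValueError("The number of steps must be positive")
    return duration_ps * 1000.0 / n_steps  # 1 ps = 1000 fs


nvt_dt = time_step_fs(100.0, 50_000)      # NVT: 100 ps in 50,000 steps
npt_dt = time_step_fs(1000.0, 1_000_000)  # NPT: 1000 ps in 1,000,000 steps
print(f"NVT time step: {nvt_dt} fs, NPT time step: {npt_dt} fs")
```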

2.7 Binding Energy Calculations

The MM-PBSA binding energies of the proposed compounds (T1–T4) were calculated with the g_mmpbsa tool [39]. This tool uses the GROMACS and APBS packages to compute the enthalpic components of the MM-PBSA interaction energy.
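As a rough illustration of how such enthalpic components combine, the sketch below sums the van der Waals, electrostatic, polar solvation, and non-polar (SASA) terms into a binding energy, with no entropy term. The class name and the numerical values are hypothetical and are not results from this study.

```python
from dataclasses import dataclass


@dataclass
class MmpbsaTerms:
    """Energy differences (complex minus protein minus ligand), in kJ/mol."""
    vdw: float         # van der Waals (molecular mechanics)
    elec: float        # electrostatic (molecular mechanics)
    polar_solv: float  # polar solvation (Poisson-Boltzmann)
    sasa: float        # non-polar solvation (SASA-based)

    def binding_energy(self) -> float:
        # Sum of the enthalpic components only; entropy is not included
        return self.vdw + self.elec + self.polar_solv + self.sasa


# Hypothetical values for illustration only
example = MmpbsaTerms(vdw=-200.0, elec=-50.0, polar_solv=150.0, sasa=-20.0)
print(f"Binding energy = {example.binding_energy():.1f} kJ/mol")
```

In this decomposition the polar solvation term usually opposes binding, so a favorable result depends on the other three terms outweighing it.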

2.8 In Silico Pharmacokinetics ADMET Study

Many drug candidates fail in clinical trials because of unacceptable ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties, which lead to toxicity or lack of efficacy [40, 41]. The online tools pkCSM [42] and SwissADME [43] were used to predict the ADMET properties of the designed compounds.

3 Results and Discussions

3.1 3D-QSAR Models

The optimal CoMFA and CoMSIA models, together with the field combinations that gave cross-validated coefficient values greater than 0.5, are given in Table 2. The CoMFA model was selected because it showed the best statistical parameters (a high cross-validated coefficient Q² and correlation coefficient R²). It is characterized by Q² = 0.663, R² = 0.978, N = 3, SEE = 0.115, and R²test = 0.756, which indicates good stability and strong predictive ability. The statistical findings also indicate that the steric and electrostatic field contributions are 67% and 33%, respectively, suggesting that the steric field plays a more important role in enhancing inhibitory activity. Figure 3 presents the correlation between predicted and observed pIC50 values for the training and test sets in the CoMFA analysis.
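For readers less familiar with these statistics, the sketch below shows one common way to compute the standard error of estimate (SEE) and the external predictive R²test. It assumes the conventional definitions (SEE with n − c − 1 degrees of freedom for c components, and R²test referenced to the mean activity of the training set); the paper does not state which definitions SYBYL applied, and the numbers used are hypothetical.

```python
import numpy as np


def see(y_obs, y_fit, n_components):
    """Standard error of estimate with n - c - 1 degrees of freedom."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    dof = len(y_obs) - n_components - 1
    if dof <= 0:
        raise ValueError("Not enough compounds for this number of components")
    return float(np.sqrt(np.sum((y_obs - y_fit) ** 2) / dof))


def r2_test(y_test_obs, y_test_pred, y_train_mean):
    """External predictive R2: 1 - PRESS / SD, with SD taken around the training mean."""
    obs = np.asarray(y_test_obs, dtype=float)
    pred = np.asarray(y_test_pred, dtype=float)
    press = np.sum((obs - pred) ** 2)
    sd = np.sum((obs - y_train_mean) ** 2)
    return float(1.0 - press / sd)


# Hypothetical pIC50 values for a five-compound test set
observed = [7.2, 8.1, 6.5, 7.9, 8.6]
predicted = [7.0, 8.3, 6.9, 7.6, 8.4]
print(f"R2_test = {r2_test(observed, predicted, y_train_mean=7.5):.3f}")
```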

Table 2 Molecular field combinations of CoMFA and CoMSIA models with statistical results

Fig. 3 The optimal CoMFA model plots

3.2 3D-QSAR Contour Maps

The CoMFA contour maps provide information on the regions in 3D space where steric and electrostatic fields are favorable or unfavorable, and thus where substitution can increase or decrease activity. Using molecule 17i as a template, the CoMFA steric and electrostatic contour maps are displayed in Fig. 4a, b.

Fig. 4 CoMFA contour maps of the steric field (a) and electrostatic field (b) with compound 17i

In the CoMFA steric contour map (Fig. 4a), the yellow contour located on the pyrimidine group suggests that adding bulky groups at this site may decrease the biological activity of compound 17i. The large green zone around the methoxy and meta positions of the phenyl group indicates that bulky substituents at these positions could enhance activity. Similarly, the small green region surrounding the methyl group on the indole ring indicates that replacing the methyl group with another group could give the desired activity.

Likewise, in the CoMFA electrostatic contour map (Fig. 4b), the red contour is more dominant than the blue one, suggesting that electropositive groups or atoms can decrease activity. In addition, the blue region near the phenyl moiety attached to piperidine indicates that electron-donating substituents are favored at those positions and would give good inhibitory activity. Furthermore, the blue regions near the pyrimidine group and the meta position of the phenyl group indicate that electron-donating substituents are also preferred at these positions for good inhibitory activity.

3.3 Y-Randomization Test of Model

To confirm that the CoMFA model does not result from a chance correlation in the training set, we used a Y-randomization test. Several random permutations of the dependent variable (pIC50) were performed. Table 3 shows the low Q²cv (LOO) and R²train values obtained, confirming that the CoMFA model is appropriate for predicting new inhibitors.

Table 3 R²train and Q²LOO values after Y-randomization tests

3.4 Designing of New Potent Inhibitors

Based on the CoMFA contour maps and the structural characteristics of compound 17i, new analogues of N-(3-amino-4-methoxyphenyl)acrylamide were designed. The new compounds (T1, T2, T3, and T4) and their predicted activities are given in Table 4.

Table 4 The newly designed molecules and their predicted activities.

3.5 Molecular Docking Results

Molecular docking was carried out to understand the nature of the interactions between the new candidates (T1, T2, T3, and T4) and the active site of the protein (PDB ID: 3W2O) and to compare them with those of the EGFR inhibitor Osimertinib. The results are displayed in Table 5.

Table 5 2D and 3D visualizations of docking results of the proposed compounds and Osimertinib with the active site of the 3W2O receptor

The docking result for compound 17i (Fig. 5) shows that the stability of this molecule is due to a variety of interaction types: carbon–hydrogen bonds with Leu788, Ala743, Met793, and Gln791; pi-alkyl interactions with Ala763, Ile759, Val726, Leu792, and Lys745; a pi-anion interaction with Glu762; pi-sulfur interactions with Met766 and Met790; and a pi-sigma interaction with Leu718.

Fig. 5 Docking results of compound 17i with EGFR tyrosine kinase protein (PDB ID: 3W2O)

In addition, docking of Osimertinib shows hydrogen bonds with Leu788 and Leu718, pi-alkyl interactions with Lys745, Ala743, and Phe723, and a single pi-sigma interaction with Val726. Moreover, the molecular docking results highlight the importance of the residues Leu788, Leu718, Lys745, Val726, Ala743, Phe723, Met790, Met766, and Glu762 in the active site of the EGFR kinase protein (3W2O).

Finally, all proposed structures (T1–T4) show more favorable interactions and higher affinity (Table 6). Compared with the third-generation EGFR inhibitor Osimertinib, the newly proposed compounds are predicted to have better inhibitory activity, making them promising candidate inhibitors.

Table 6 Binding interaction of proposed compounds and Osimertinib against EGFR

3.6 Molecular Docking Validation

The co-crystallized ligand was removed from the protein (PDB ID: 3W2O) and re-docked into the same site to validate the reliability of the molecular docking technique. Figure 6 shows the superimposition of the original ligand (red) and the re-docked ligand (blue), with a root mean square deviation (RMSD) of 1.765 Å, which is within the accepted limit of 2 Å. The docked ligand binds at a location comparable to that of the co-crystallized ligand and can therefore interact with the same amino acid residues. As a result, the docking validation demonstrates the reliability of the molecular docking protocol.

Fig. 6 Re-docking pose of the co-crystallized ligand with an RMSD value of 1.765 Å (red = original, blue = docked)

3.7 Molecular Dynamic Results

3.7.1 Root Mean Square Deviations RMSD

Using GROMACS, the RMSD was calculated for each complex (CT1, CT2, CT3, and CT4) based on the backbone atoms. The RMSD plots for the protein complexes (Fig. 7a) show that the structures remained stable throughout the simulation, with fluctuations within a range of ~1 Å, which is typical behavior for a globular protein. The average backbone RMSD is around 2 Å for all four complexes. The RMSD of each ligand (T1, T2, T3, and T4) was also calculated with GROMACS based on the ligand atoms (Fig. 7a). All the ligands (T1, T2, T3, and T4) remain bound throughout the simulation and have a stable RMSD.

Fig. 7 From right to left: a RMSD, b RMSF, and c radius of gyration of the complexes during the 100 ns MD simulation. Complexes CT1 (Row 1), CT2 (Row 2), CT3 (Row 3), and CT4 (Row 4)

3.7.2 Root Mean Square Fluctuations RMSF

The RMSF was calculated with GROMACS using the C-alpha atoms. Except for specific residues located in loops or turns of the 3W2O protein, the fluctuations remained below 3.0 Å (Fig. 7b).

3.7.3 The Radius of Gyration (ROG)

Using GROMACS, the radius of gyration (Rog) was determined for each complex (T1, T2, T3, and T4) based on the C-alpha atoms. All four complexes display a very stable radius of gyration, with fluctuations of less than 1 Å, indicating the stability and compactness of the structures. The slight variation in the Rog value reflects minor opening and closing of the N- and C-terminal domains during the MD simulation (Fig. 7c).
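The quantity behind this analysis is the mass-weighted root mean square distance of the selected atoms from their center of mass. The sketch below assumes that standard definition; the function name and coordinates are hypothetical, and GROMACS itself was used for the values reported here.

```python
import numpy as np


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of atoms (same units as coords)."""
    xyz = np.asarray(coords, dtype=float)
    m = np.ones(len(xyz)) if masses is None else np.asarray(masses, dtype=float)
    center = np.average(xyz, axis=0, weights=m)       # center of mass
    sq_dist = np.sum((xyz - center) ** 2, axis=1)     # squared distance of each atom
    return float(np.sqrt(np.average(sq_dist, weights=m)))


# Hypothetical C-alpha coordinates in angstroms for one frame
frame = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0], [0.0, 3.8, 0.0]]
print(f"Rog = {radius_of_gyration(frame):.3f} A")
```

Evaluating this function frame by frame gives the kind of time series plotted for each complex; a flat curve indicates that the fold stays compact.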

3.7.4 Hydrogen Bonds (Protein–ligand)

The total number of hydrogen bonds formed between each ligand (T1, T2, T3, and T4) and 3W2O over the 100 ns simulation is shown in Fig. 8A. All the ligands (T1, T2, T3, and T4) show a stable hydrogen-bonding network with 3W2O throughout the simulation.

Fig. 8 From right to left: A hydrogen bonds (protein–ligand) and B average distance between the ligand and the protein for the complexes during the 100 ns MD simulation. Complexes CT1 (Row 1), CT2 (Row 2), CT3 (Row 3), and CT4 (Row 4)

3.7.5 Average Center-of-Mass Distance

The average center-of-mass (COM) distance between each ligand (T1, T2, T3, and T4) and 3W2O during the 100 ns simulation is shown in Fig. 8B. The COM–COM distances show that none of the ligands (T1, T2, T3, and T4) leaves its binding site; the distance either decreases or stabilizes after 40 to 60 ns of simulation time.
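The COM–COM distance for a single frame can be sketched as follows. The coordinates and masses are hypothetical, and the function names are illustrative rather than part of any GROMACS interface.

```python
import numpy as np


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms."""
    return np.average(np.asarray(coords, dtype=float), axis=0,
                      weights=np.asarray(masses, dtype=float))


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Distance between the centers of mass of two atom groups in one frame."""
    delta = center_of_mass(coords_a, masses_a) - center_of_mass(coords_b, masses_b)
    return float(np.linalg.norm(delta))


# Hypothetical two-atom "protein" and two-atom "ligand" (angstroms, atomic mass units)
protein_xyz, protein_m = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]], [12.0, 12.0]
ligand_xyz, ligand_m = [[1.0, 3.0, 0.0], [1.0, 5.0, 0.0]], [12.0, 12.0]
print(f"COM-COM distance = {com_distance(protein_xyz, protein_m, ligand_xyz, ligand_m):.2f} A")
```

A ligand that dissociates would show this distance growing steadily over the trajectory, which is the behavior the analysis rules out.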

3.7.6 Contact Frequency (CF) Analysis

The contactFreq.tcl module in VMD, with a cutoff of 4 Å, was used to perform the contact frequency (CF) analysis. The residues of 3W2O with the highest CF% in contact with the test ligands (T1, T2, T3, and T4) are presented in Fig. 9. The residues with the highest contacts were Leu718, Phe723, Val726, Ala743, Lys745, Ile759, Met766, Leu788, Met790, Leu792, Met793, Arg841, Leu844, Thr854, Asp855, and Glu762.
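The idea behind a contact frequency can be sketched as the percentage of frames in which any atom of a residue lies within the cutoff of any ligand atom. This definition is an assumption made for illustration and may differ in detail from the VMD script; the trajectory arrays below are hypothetical.

```python
import numpy as np


def contact_frequency(ligand_traj, residue_traj, cutoff=4.0):
    """Percentage of frames with at least one ligand-residue atom pair within cutoff.

    ligand_traj:  array of shape (n_frames, n_ligand_atoms, 3)
    residue_traj: array of shape (n_frames, n_residue_atoms, 3)
    """
    lig = np.asarray(ligand_traj, dtype=float)
    res = np.asarray(residue_traj, dtype=float)
    # Pairwise distances per frame: shape (n_frames, n_ligand_atoms, n_residue_atoms)
    dist = np.linalg.norm(lig[:, :, None, :] - res[:, None, :, :], axis=-1)
    in_contact = (dist <= cutoff).any(axis=(1, 2))  # one boolean per frame
    return float(100.0 * in_contact.mean())


# Hypothetical 4-frame trajectory: one ligand atom at the origin,
# one residue atom at 3.0, 3.5, 5.0, and 10.0 angstroms along x
ligand = np.zeros((4, 1, 3))
residue = np.zeros((4, 1, 3))
residue[:, 0, 0] = [3.0, 3.5, 5.0, 10.0]
print(f"CF = {contact_frequency(ligand, residue):.1f} %")
```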

Fig. 9 Contact frequency (CF) analysis

3.7.7 Potential Energy, Pressure, and Temperature

The potential energy, pressure, and temperature of the system during the 100 ns MD simulation, as obtained from the GROMACS .edr file, are shown in Fig. 10. The plots show that the potential energy, temperature, and pressure remain converged throughout the 100 ns simulations.

Fig. 10 From left to right: A temperature, B pressure, and C potential energy during the 100 ns MD simulations

3.7.8 MM/PBSA Binding Energy

The Molecular Mechanics/Poisson–Boltzmann Surface Area (MM/PBSA) method was selected to re-score the binding free energies of the complexes T1, T2, T3, and T4 because it is a fast force field-based method for calculating binding free energies compared with other free energy calculation methods, such as free energy perturbation (FEP) or thermodynamic integration (TI). The calculated binding free energies are presented in Table 7.

Table 7 Binding free energies of tested compounds [kJ/mol]

3.8 ADMET Properties

To predict the ADMET properties of the new compounds (T1–T4), the pkCSM and SwissADME online tools were used; the results are listed in Table 8.

Table 8 ADMET prediction of the newly proposed inhibitors

Molecules with less than 30% intestinal absorption are considered poorly absorbed. All new compounds have values above 89%, indicating good absorption in the human intestine. The volume of distribution (VDss) is regarded as high when log VDss exceeds 0.45. Blood–brain barrier (BBB) and central nervous system (CNS) permeability are described by logBB and logPS, respectively. A compound with logBB > 0.3 is considered to cross the BBB readily, whereas one with logBB < −1 is poorly distributed to the brain. Likewise, logPS > −2 suggests CNS penetration, while logPS < −3 indicates that it is difficult for the drug to enter the CNS. On this basis, the predicted values indicate that the four proposed molecules do not penetrate the BBB.
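These interpretation rules can be written as a small helper. The sketch assumes the thresholds from the pkCSM interpretation guidance (30% for intestinal absorption, 0.45 and −0.15 for log VDss, 0.3 and −1 for logBB, −2 and −3 for logPS); the function name, the "intermediate" and "moderate" labels, and the example values are hypothetical and are not results from Table 8.

```python
def classify_admet(intestinal_abs_pct, log_vdss, log_bb, log_ps):
    """Label predicted ADMET values using pkCSM-style interpretation thresholds."""
    if log_vdss > 0.45:
        vdss = "high"
    elif log_vdss < -0.15:
        vdss = "low"
    else:
        vdss = "moderate"

    if log_bb > 0.3:
        bbb = "readily crosses BBB"
    elif log_bb < -1:
        bbb = "poorly distributed to brain"
    else:
        bbb = "intermediate"

    if log_ps > -2:
        cns = "penetrates CNS"
    elif log_ps < -3:
        cns = "unable to penetrate CNS"
    else:
        cns = "intermediate"

    return {
        "absorption": "poor" if intestinal_abs_pct < 30 else "not poor",
        "vdss": vdss,
        "bbb": bbb,
        "cns": cns,
    }


# Hypothetical predicted values for one compound, for illustration only
labels = classify_admet(intestinal_abs_pct=92.0, log_vdss=0.6, log_bb=-1.2, log_ps=-3.1)
for prop, label in labels.items():
    print(f"{prop}: {label}")
```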

Cytochrome P450 (CYP) enzymes play a major role in the metabolism of various compounds, including drugs, environmental chemicals, and endogenous substrates. These enzymes are widely distributed in various tissues and organs. CYP3A4 is the most abundant CYP enzyme in the liver and is responsible for the metabolism of more than 50% of all drugs. The results show that all designed compounds are predicted to be substrates and inhibitors of CYP3A4. A lower clearance index denotes a drug's ability to persist in the body. The findings show that each compound under investigation has an acceptable clearance index, indicating its persistence in the body. Finally, the negative Ames toxicity test indicates that none of the compounds is predicted to be mutagenic.

4 Conclusion

This research investigated the inhibitory potency of 28 N-(3-amino-4-methoxyphenyl)acrylamide derivatives as EGFR L858R/T790M kinase inhibitors. The CoMFA model showed excellent predictive power with good statistical results (Q² = 0.663, R² = 0.978, R²test = 0.756). Four compounds (T1–T4) with strong predicted inhibitory activity were designed according to the CoMFA contour maps. In addition, the docking analysis of the designed compounds (T1–T4) revealed more interactions, and more types of interaction, than for the EGFR inhibitor Osimertinib. A 100 ns MD simulation of the four newly designed molecules (T1–T4) showed small fluctuations of RMSD and RMSF, supporting the outcomes of the 3D-QSAR model and molecular docking. The ADMET results showed favorable predicted pharmacokinetic properties and no predicted toxicity. These results will help optimize the discovery of new drugs that can address drug resistance. Further research can be carried out to synthesize the new compounds and evaluate their in vitro activity as EGFR L858R/T790M kinase inhibitors.