Introduction

The increasing incidence of cancer has made it one of the most serious public health problems, and treating it effectively remains a major challenge for health care [1]. Bromodomain-containing protein 4 (BRD4) is a key epigenetic regulator in cancer and the most important functional member of the bromodomain and extra-terminal (BET) protein family; it contains two bromodomains and one extra-terminal domain [2]. Studies have shown that BRD4 regulates cell proliferation and apoptosis by recruiting positive transcription elongation factor b (P-TEFb) to activate C-MYC, C-JUN and other genes, and therefore plays an important role in the occurrence and development of tumors and in the infiltration and migration of tumor cells [3]. The occurrence and development of a variety of malignant tumors, such as lung cancer, breast cancer, hematological tumors and liver cancer, are all related to the dysfunction and overexpression of BRD4 [4, 5]. Accordingly, inhibiting BRD4 to counteract its overexpression can be effective against cancer, and BRD4 has been developed as a therapeutic target in anticancer drug research. The molecule (+)-JQ1 developed by Qi et al. [6, 7], which contains a triazolodiazepine core skeleton, was identified as a selective inhibitor of the BET subfamily and showed an antitumor effect in a xenograft model of NUT midline carcinoma (NMC). Pfizer [8] discovered a new scaffold, 3,4-dihydro-3-methyl-2(1H)-quinazolinone, through fragment-based drug discovery; the scaffold was further optimized to an IC50 of about 0.1–0.5 μM in a peptide displacement biochemical assay. Nevertheless, the emergence of drug resistance gradually limits the effect of target protein inhibitors with a specific mode of action [9,10,11], so the development of new BRD4 inhibitors is an urgent and promising route to cancer treatment.

To counter the problems caused by drug resistance, the dual inhibitor BI-2536 (Fig. 1a), which inhibits both BRD4 and the PLK1 protein kinase, two proteins that normally drive cancer, has advanced anticancer drug development [12, 13]. Its good anticancer effect and few side effects quickly made BI-2536 one of the most promising anticancer drug candidates [14]. The docking results in Fig. 1b show that BI-2536, as ligand R78 500, forms stable direct hydrogen bonds with ARG57, LEU59 and CYS133 of the PLK1 mutant (O–H–N-ARG57, 2.7 Å; N–H–O-LEU59, 3.1 Å; N–H–O-CYS133, 3.1 Å; N–H–N-CYS133, 2.9 Å), and water-mediated hydrogen bonds with LYS82, GLU101, HIS105, GLU131, ARG135/136 and other amino acid residues. In Fig. 1c, BI-2536, as ligand R78 202, forms a strong and stable direct hydrogen bond with residue ASN140 in the first bromodomain of BRD4 through the oxygen atom of its formamide group (O–H–N-ASN140, 2.9 Å). The surrounding residues GLN85, VAL87, TYR97/139 and ASP144/145 also form hydrogen bonds with R78 202 through bridging water molecules. However, obtaining molecules with inhibitory activity is difficult and requires many time-consuming and labor-intensive experiments. Elucidating the relationship between molecular structure and BRD4 inhibitory activity may therefore provide effective help for the development of anticancer drugs.

Fig. 1

The structure of BI-2536 and the results of docking with the two target proteins. a: 2D structure of the BI-2536 inhibitor; b: docking of BI-2536 as ligand R78 500 with the PLK1 protein (2RKU); c: docking of BI-2536 as ligand R78 202 with the BRD4 protein (4OGI). Amino acid residues surrounding the ligand are shown in stick format and colored by atom type. The ligand is displayed in stick format and colored by atom type (green = carbon)

The good inhibitory performance of BI-2536 inspired researchers to optimize and analyze its scaffold [15]. Hu et al. [16] rationally designed, synthesized and evaluated a series of novel dihydroquinoline-2(1H)-one derivatives as selective bromodomain inhibitors inspired by BI-2536. Chen et al. [17] modified the structure of BI-2536 to obtain an inhibitor that is more selective for the BRD4 protein and has a moderate inhibitory effect on PLK1. The structure–activity relationship summarized in that work may help to fine-tune ligand properties in this way and so expand personalized medicine. Wang et al. [18] later substituted 1,2,4-triazole for the N-methyllactam of BI-2536 and, through screening, obtained a series of new compounds with high dual inhibitory activity and PLK1 selectivity, which was also confirmed by recent studies on BRD4 inhibitors [19]. On this basis, more complete structural modification information for BRD4 selectivity may help delineate the biochemical pathways and multidrug profiles targeted by other drugs. We therefore attempt to modify the newest type of molecular structure synthesized by Wang et al. to increase the selectivity of the ligand compounds for the BRD4 protein. The 2D/3D-QSAR approaches provide 2D/3D chemical and electronic information rather than simple structural transformations, which would help to relieve the severe restrictions imposed by the emergence of drug-resistant mutations.

Quantitative structure–activity relationship (QSAR) modeling has been widely used to assist drug design in recent years. 3D-QSAR introduces the three-dimensional structure of drug molecules into the analysis and indirectly reflects the relationship between drug molecules and macromolecules, as well as the non-bonding interactions involved in their binding [20,21,22]. Because of its efficiency and accuracy, QSAR has broad application prospects as a preliminary step before drug synthesis [23]. In this research, 3D-QSAR models (CoMFA, CoMSIA and topomer CoMFA) and a 2D-QSAR model (HQSAR) were used to analyze the relationship between the structure and inhibitory ability of small-molecule BRD4 inhibitors. From the contour maps of CoMFA, CoMSIA and topomer CoMFA and the color-coded contribution maps of HQSAR, we derived structural modification hints for compound 15, the most active compound, and verified the QSAR predictions by designing new compounds. In addition, the molecular docking results showed that the N atom of the triazole, acting as a hydrogen bond acceptor, is essential for improving inhibitory ability. The high predictive ability of the models and the verification results can guide the design of promising BRD4 inhibitors.

Material and methods

Set division and data establishment

Thirty-seven compounds with BRD4 inhibitory activity were selected from the literature [18] and are listed in Table 1. Ten representative compounds were randomly selected as the test set, and the remaining 27 molecules were used as the training set to build the 2D/3D-QSAR models. The activity values (IC50 in nM) were converted to the corresponding pIC50 (−logIC50) values and used as the dependent variable in the QSAR models. The sketch tool in SYBYL 2.0 was used to build the 3D conformation of each compound. The energy of each drawn compound was minimized with the Tripos force field [24] and the Powell conjugate gradient algorithm, using an energy convergence criterion of 0.005 kcal/(mol·Å) and Gasteiger-Hückel charges [25] for the atomic charges. The maximum number of iterations was set to 1000 to obtain a stable conformation. The minimized molecules were stored in one folder and opened as a SYBYL spreadsheet, to which the activity values were added from a TXT file. All other parameters kept the SYBYL 2.0 default values.
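The data preparation described above can be sketched in a few lines. This is a minimal illustration and not the authors' procedure: it assumes that IC50 values in nM are first converted to mol/L before taking the negative logarithm (a common convention that the text does not state explicitly), and the function names and the random seed are hypothetical. The two IC50 values used are those reported later for compounds 15 and 35.

```python
import math
import random

def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L).

    Assumption: the conversion goes through molar units.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)

def split_train_test(ids, n_test=10, seed=0):
    """Randomly hold out n_test compounds; the rest form the training set.

    The seed is arbitrary; the split actually used in the study is not known.
    """
    rng = random.Random(seed)
    test = sorted(rng.sample(list(ids), n_test))
    held_out = set(test)
    train = [i for i in ids if i not in held_out]
    return train, test

compound_ids = list(range(1, 38))            # 37 compounds, numbered 1..37
train_ids, test_ids = split_train_test(compound_ids)
print(len(train_ids), len(test_ids))         # 27 training, 10 test compounds
print(round(pic50_from_nm(77), 2))           # compound 15, IC50 = 77 nM
print(round(pic50_from_nm(10000), 2))        # compound 35, IC50 = 10000 nM
```

Under this convention the most active compound 15 has a pIC50 of about 7.11 and the least active compound 35 a pIC50 of 5.0, so the data set spans roughly two log units.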

Table 1 Compound structure and their actual IC50 and pIC50 values

Molecular alignment

Molecular alignment strongly affects the quality of QSAR modeling (CoMFA, CoMSIA, HQSAR) [26], and the predictive ability of the models depends on the reliability of the contour maps [27]. To obtain the best results, the most active compound 15 was selected as the alignment template. Rigid alignment was used to keep the lowest-energy conformation of each compound, and the largest common skeleton of the compounds was selected for superposition. The alignment function in the SYBYL 2.0 package was used for the conformational superposition; the alignment results are shown in Fig. 2.

Fig. 2

Molecular alignment mode. a: Molecular alignment based on compound 15; b: 2D structure of compound 15 (blue represents the common skeleton)

CoMFA study

Comparative molecular field analysis (CoMFA), which is based on steric and electrostatic fields, is the most widely used method in QSAR research [28]. After the compounds are collected, they are superimposed according to reasonable matching rules. The CoMFA tool then generates a region, defined in Cartesian coordinates, that accommodates all molecules and keeps a certain distance from their outermost atoms. The region is divided into a grid with a fixed spacing, and the non-covalent interaction energy of each field is calculated at every grid point [29]. The steric field is calculated with the Lennard–Jones formula (1), and the electrostatic field with the Coulomb formula (2). Because the large number of grid points gives far more independent variables than compounds, the partial least squares (PLS) method [30] is applied to relate activity to these variables. The cross-validated correlation coefficient \({q}_{cv}^{2}\) and the non-cross-validated correlation coefficient \({r}_{ncv}^{2}\) are usually used to measure the predictive ability of the CoMFA model [31].

$$ E_{vdW} = \sum_{i = 1}^{n} \left( A_{j} r_{ij}^{-12} - C_{j} r_{ij}^{-6} \right) $$
(1)
$$ E_{C} = \sum_{i = 1}^{n} \frac{q_{i} q_{j}}{D r_{ij}} $$
(2)

where \({E}_{vdW}\) is the steric (van der Waals) potential energy of the compound; \({A}_{j}\) and \({C}_{j}\) are the corresponding atomic van der Waals constants; \({r}_{ij}\) is the distance between atom i of the molecule and the probe atom j; n is the total number of atoms in the molecule; \({E}_{C}\) is the electrostatic field energy of the compound; \({q}_{i}\) is the net charge of atom i calculated by the Gasteiger-Hückel method; \({q}_{j}\) is the charge of the probe atom; D is the dielectric constant.
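As an illustration of formulas (1) and (2), the sketch below evaluates both field energies at a single grid point. It is only a schematic reading of the two equations: the atom tuples, the constants and the function names are hypothetical, no unit conversion factor is applied to the Coulomb term, and the small `min_dist` floor is an added guard against division by zero that is not part of the formulas.

```python
import math

# Each atom is (x, y, z, A, C, q): coordinates in Å, van der Waals
# constants A and C, and partial charge q. All values are illustrative.

def steric_energy(atoms, probe_xyz, min_dist=1e-6):
    """Formula (1): sum over atoms of A * r^-12 - C * r^-6 at one grid point."""
    total = 0.0
    for x, y, z, a_const, c_const, _charge in atoms:
        r = max(math.dist((x, y, z), probe_xyz), min_dist)
        total += a_const * r ** -12 - c_const * r ** -6
    return total

def coulomb_energy(atoms, probe_xyz, probe_charge=1.0, dielectric=1.0,
                   min_dist=1e-6):
    """Formula (2): sum over atoms of q_i * q_j / (D * r_ij) at one grid point."""
    total = 0.0
    for x, y, z, _a_const, _c_const, charge in atoms:
        r = max(math.dist((x, y, z), probe_xyz), min_dist)
        total += charge * probe_charge / (dielectric * r)
    return total

# One hypothetical atom at the origin, probe placed 2 Å away on the x axis.
toy_atoms = [(0.0, 0.0, 0.0, 1.0, 1.0, 1.0)]
grid_point = (2.0, 0.0, 0.0)
print(steric_energy(toy_atoms, grid_point))    # 2**-12 - 2**-6
print(coulomb_energy(toy_atoms, grid_point))   # 1 * 1 / (1 * 2)
```

Repeating these two evaluations at every grid point of the lattice yields the steric and electrostatic descriptor columns that PLS later relates to pIC50.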

CoMSIA study

Comparative molecular similarity indices analysis (CoMSIA) was developed to overcome the limitations of CoMFA. In CoMSIA, molecular similarity indices calculated as in the SEAL method are used as descriptors, covering steric, electrostatic, hydrophobic and hydrogen-bonding properties. These indices are estimated indirectly by comparing the similarity between each molecule in the data set and a common probe atom (radius of 1 Å, charge of +1, hydrophobicity of +1) [32]. Unlike CoMFA, CoMSIA uses a Gaussian distance function for the similarity indices, which avoids abrupt changes in the grid-based interaction with the probe atom. Furthermore, the probe is not limited to steric or electrostatic potentials but also covers hydrophobic and hydrogen-bonding properties [33] (hydrogen bond acceptor and donor fields).
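The effect of the Gaussian function can be seen in a small sketch of a similarity index at one grid point. The expression used here, a negative sum of probe property × atom property × exp(−α r²), and the attenuation factor α = 0.3 follow the commonly cited form of the CoMSIA index; neither is given in the text, so both should be read as assumptions, and the atom data are hypothetical.

```python
import math

def comsia_index(atoms, grid_xyz, probe_value=1.0, alpha=0.3):
    """Similarity index of one property field at one grid point.

    atoms: list of ((x, y, z), property_value) pairs.
    alpha: Gaussian attenuation factor (assumed default, not from the text).
    """
    total = 0.0
    for xyz, atom_value in atoms:
        r_squared = sum((a - b) ** 2 for a, b in zip(xyz, grid_xyz))
        total += probe_value * atom_value * math.exp(-alpha * r_squared)
    return -total

# One hypothetical atom with property value 1 at the origin.
toy_atoms = [((0.0, 0.0, 0.0), 1.0)]
print(comsia_index(toy_atoms, (0.0, 0.0, 0.0)))   # finite even at zero distance
print(comsia_index(toy_atoms, (1.0, 0.0, 0.0)))   # decays smoothly with distance
```

Because the exponential stays finite at zero distance, the index needs no energy cutoff near atomic positions, in contrast to the \(r^{-12}\) and \(r^{-1}\) terms of CoMFA.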

Topomer CoMFA study

Topomer CoMFA is a CoMFA modeling method based on topomer technology. It eliminates steps that can bias the prediction results, such as manual molecular alignment, and expresses the activity of a compound as the sum of the contributions of the fragments obtained by cutting the molecule [34]. Like CoMFA, topomer CoMFA produces steric and electrostatic contour maps from the positioned fragments. The electrostatic field is calculated with the Tripos force field using a probe with a +1 charge. With pIC50 as the dependent variable, the model is built by PLS on the steric and electrostatic descriptors [35]. Here, the topomer CoMFA tool in the SYBYL 2.0 software package was used to divide all 37 compounds into two or more fragments with the same cutting method and so establish a reasonable 3D-QSAR model. The predictive ability of the QSAR model largely depends on where the molecules are cut.

HQSAR study

HQSAR is a 2D-QSAR method based on atomic fragments that does not require 3D conformations [36]. It converts the chemical representation of each molecule into a molecular hologram, which removes the need for molecular alignment and conformation specification [37]. The SLN (SYBYL line notation) of each generated fragment is mapped to a unique integer [38]. Each two-dimensional structure is thus converted into a fingerprint-like signal whose fragments are assigned to the bins of a hologram with a length between 53 and 401, which relates specific atoms of the compound to the biological activity [39]. Different fragment descriptors, including atoms (A), bonds (B), connections (C), chirality (Ch), hydrogen atoms (H), and donor and acceptor (DA), were combined, and the fragment size and hologram length were adjusted, to obtain an HQSAR model with good predictive ability.
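The hologram construction can be illustrated schematically. In the sketch below, each fragment string is hashed to an integer and counted in one of a fixed number of bins. The use of `zlib.crc32` as the hash, the fragment strings and the function name are stand-ins chosen for illustration; they are not the SYBYL implementation.

```python
import zlib

def molecular_hologram(fragments, length=61):
    """Count fragments into a fixed-length hologram.

    fragments: iterable of fragment strings (e.g. line notations).
    length: hologram length; 61 is the length of the best model reported here.
    """
    hologram = [0] * length
    for fragment in fragments:
        fragment_id = zlib.crc32(fragment.encode("utf-8"))  # stand-in hash
        hologram[fragment_id % length] += 1
    return hologram

# Hypothetical fragment strings for one molecule; a repeated fragment
# increments the same bin twice.
toy_fragments = ["C(=O)N", "c1ccsc1", "n1cnnc1", "C(=O)N"]
hologram = molecular_hologram(toy_fragments)
print(len(hologram), sum(hologram))   # hologram length, number of fragments
```

The bin counts form the descriptor vector of the molecule. Because different fragments can collide in the same bin, several hologram lengths are usually tried, which is what the results section does.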

Partial least square analysis

Because of the large number of independent variables, the partial least squares (PLS) method [30] was used to establish the linear correlation between the fields and the biological activity. First, leave-one-out cross-validation is used to select the model with the best predictive ability, and regression analysis is then performed with the resulting optimal number of components (N) [40]. Non-cross-validated analysis is also performed with the optimal number of components to evaluate the fitting ability of the model. Finally, the test set is imported and its biological activity is predicted by the QSAR models for external evaluation. \({q}_{cv}^{2}\), \({r}_{ncv}^{2}\), the standard error of estimate (SEE) and the Fisher ratio (F) obtained with the optimal number of components are important criteria for judging the predictive ability of the model. Additionally, \({r}_{pred}^{2}\) is calculated from the test set compounds by formula (3). High values of \({q}_{cv}^{2}\), \({r}_{ncv}^{2}\) and \({r}_{pred}^{2}\) (\({q}_{cv}^{2}\) > 0.5, \({r}_{ncv}^{2}\) > 0.6, \({r}_{pred}^{2}\) > 0.5) indicate good predictive ability of the established model [41].

$$ r_{{\text{pred}}}^{2} = \left( {\frac{{SD - {\text{PRESS}} }}{SD}} \right) $$
(3)

where SD is the sum of squared deviations between the biological activity of each molecule in the test set and the average activity of the molecules in the training set; PRESS is the sum of squared deviations between the predicted and actual activity of each molecule in the test set.
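Formula (3) and the acceptance thresholds quoted above translate directly into code. The sketch below uses made-up activity values purely to show the arithmetic; the function names are hypothetical.

```python
def press(observed, predicted):
    """Sum of squared differences between observed and predicted activities."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def r2_pred(test_observed, test_predicted, train_mean):
    """Formula (3): (SD - PRESS) / SD, evaluated on the test set."""
    sd = sum((o - train_mean) ** 2 for o in test_observed)
    return (sd - press(test_observed, test_predicted)) / sd

def passes_thresholds(q2_cv, r2_ncv, r2_prediction):
    """Criteria quoted in the text: q2 > 0.5, r2 > 0.6, r2_pred > 0.5."""
    return q2_cv > 0.5 and r2_ncv > 0.6 and r2_prediction > 0.5

# Made-up pIC50 values, for illustration only.
toy_observed = [6.0, 7.0, 8.0]
toy_predicted = [6.5, 7.0, 7.5]
toy_r2_pred = r2_pred(toy_observed, toy_predicted, train_mean=7.0)
print(toy_r2_pred)   # SD = 2.0, PRESS = 0.5, so (2.0 - 0.5) / 2.0 = 0.75
print(passes_thresholds(0.574, 0.947, toy_r2_pred))
```

Note that SD is measured against the training-set mean, so a model that merely predicts that mean for every test compound gets \({r}_{pred}^{2}\) = 0.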

Molecular docking

Molecular docking simulations can illustrate the binding conformation of small-molecule inhibitors with target proteins and reveal the structural requirements for an inhibitory effect [42]. The Surflex-dock module in SYBYL was used to clarify the binding of the inhibitor molecules to the target BRD4 protein. Surflex-dock uses a site-specific docking method: the original ligand (usually an inhibitor) is removed from the target protein and the docked molecules are placed at the same fixed site. The similarity between docked and original poses is then compared to verify whether the docking result is reasonable [32]. The protein was taken from the PDB database (PDB ID: 4OGI), which is the crystal structure of the first bromodomain of human BRD4 (https://www.rcsb.org/structure/4OGI) in complex with R78 202 [43]. Before docking, the downloaded protein must undergo ligand extraction, addition of hydrogens, assignment of charges and removal of water molecules to ensure a clean and complete protein structure [44]. The ligands were prepared by adding Gasteiger-Hückel charges and minimizing the energy with the Powell method [24]. The docking result is usually judged with the C-score function, which includes the total score, crash and polar terms [45]. A total score greater than 5 is taken to indicate a stable interaction between the protein and the ligand, which is then theoretically able to exert its inhibitory activity [46].

Results and discussion

CoMFA results

Table 2 summarizes the statistical results of the CoMFA models for BRD4 inhibitory activity. When both the electrostatic and steric fields are used to analyze the training set by PLS, the optimal number of components is 3, which gives \({q}_{cv}^{2}\) = 0.574, \({r}_{ncv}^{2}\) = 0.947, SEE = 0.165 and F = 138.348. The CoMFA model only considers steric and electrostatic fields, whose contributions are 0.575 and 0.425, respectively, which indicates that the steric field contributes more to the biological activity than the electrostatic field. Figure 3a plots the experimental values against the values predicted by the CoMFA model; the points are distributed closely on both sides of the straight line. These statistical results show that the model has good predictive capability. The final results of the CoMFA model are listed in Table 3.
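As a rough consistency check on these statistics, the F value can be recomputed from \({r}_{ncv}^{2}\), the number of components and the training set size. The sketch assumes the usual regression form F = (r²/N) / ((1 − r²)/(n − N − 1)) with n = 27 training compounds (37 minus the 10 test compounds); whether SYBYL uses exactly this expression is an assumption, and the function name is hypothetical.

```python
def f_statistic(r2, n_samples, n_components):
    """F = (r2 / N) / ((1 - r2) / (n - N - 1)) for a model with N components."""
    residual_dof = n_samples - n_components - 1
    if residual_dof <= 0:
        raise ValueError("not enough samples for this number of components")
    return (r2 / n_components) / ((1.0 - r2) / residual_dof)

n_train = 37 - 10                      # training set size implied by the split
f_comfa = f_statistic(0.947, n_train, 3)
print(round(f_comfa, 1))               # roughly 137, near the reported 138.348
```

The small gap to the reported value is within what rounding \({r}_{ncv}^{2}\) to three decimals can produce, so under these assumptions the reported statistics are mutually consistent.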

Table 2 Statistical parameters of CoMFA models
Fig. 3

The plot of experimental and predicted activities based on training and test sets

Table 3 Calculated data for the CoMFA and CoMSIA model

CoMSIA results

Table 4 summarizes the statistical results of the CoMSIA models. Because the five descriptor fields are related to one another, we first calculated the effect of each field on the predictive ability of CoMSIA. The fields include not only the electrostatic (E) and steric (S) fields but also the hydrophobic (H), H-bond donor (D) and H-bond acceptor (A) fields. Among the single-field models, the one that considers only the H-bond acceptor field (A) gave the best predictions. We therefore combined the other descriptors with the H-bond acceptor field and found that the combination of the H-bond acceptor (A) and steric (S) fields gave the model with the best predictive ability, with \({q}_{cv}^{2}\) = 0.622, \({r}_{ncv}^{2}\) = 0.916, N = 4 and F = 59.740. The contributions of the steric and hydrogen bond acceptor fields are 0.469 and 0.531, respectively. Figure 3b plots the values predicted by the CoMSIA model against the experimental values. The predictive ability of CoMSIA is good and the relationship is essentially linear. The final results of the CoMSIA model are listed in Table 3.

Table 4 Statistical parameters of CoMSIA models

Topomer CoMFA results

The predictive ability of the topomer CoMFA model largely depends on how the molecules are cut. Compound 15, which has the best inhibitory activity, was chosen as the cutting template and divided into two or three fragments with the “split in two” or “two” option; the remaining molecules were then cut automatically in the same way. After cutting, PLS was used to perform regression analysis on the topomer CoMFA results. As shown in Table 5, the four cutting methods all give reasonably good predictions; cutting method 4 gives the best overall results, with \({q}_{cv}^{2}\) = 0.691, \({r}_{ncv}^{2}\) = 0.912, F = 79.501 and \({r}_{pred}^{2}\) = 0.671. It is worth noting that cutting method 2 has a higher external predictive ability than cutting method 4 (\({r}_{pred}^{2}\) = 0.699), but its steric and electrostatic fields are not meaningful to discuss, because this cutting method only considers the contribution of a small substituent. Figure 3c shows the linear relationship between the experimental values and the values predicted by the topomer CoMFA model based on cutting method 4. Because topomer CoMFA has good predictive ability, the correlation coefficient in Fig. 3c is close to 1.

Table 5 Cutting model and statistical results of topomer CoMFA

HQSAR results

After the molecular superposition of the training set, eighteen HQSAR models were established by combining different fragment descriptors. As shown in Table 6, the best HQSAR model with the default fragment length of 4–7 is obtained by combining the fragment descriptors atoms (A), bonds (B), connections (C), and donor and acceptor (DA), which gives \({q}_{cv}^{2}\) = 0.732 and \({r}_{ncv}^{2}\) = 0.931, with a hologram length of 257 and N = 4. Keeping these fragment descriptors unchanged, different fragment lengths were then tried to improve the model. As shown in Table 7, the best model has \({q}_{cv}^{2}\) = 0.759 and \({r}_{ncv}^{2}\) = 0.963, with a hologram length of 61, N = 5 and fragment lengths of 5–8. The \({r}_{pred}^{2}\) obtained through external validation with the test set is 0.750. Figure 3d plots the pIC50 values predicted by the optimal HQSAR model against the experimental pIC50 values; the points lie essentially on a straight line and the correlation is close to 1.

Table 6 HQSAR analysis of fragment distinctions in fragment size (4–7)
Table 7 Statistical parameters for the different HQSAR model distinctions

Analysis of the CoMFA contour maps results

In Fig. 4, the steric field contour maps of CoMFA and CoMSIA are composed of green areas (80% favored level) and yellow areas (20% disfavored level), which indicate that increasing the volume is beneficial or detrimental to the activity, respectively. The electrostatic contour map of CoMFA is composed of blue areas (80% favored level) and red areas (20% disfavored level), which favor electropositive and electronegative groups, respectively. The steric contour map of the CoMFA model in Fig. 4a shows a yellow area wrapped around the beginning of the R1 group, which means that increasing the volume of the substituent here is not conducive to improving the activity of the compound. For instance, the activity of compound 2, whose R1 group is cyclohexane, is significantly higher than that of the compound whose R1 is a benzene ring. In compound 1, the R1 group is connected directly, without the rotatable methyl group present in other compounds, and the greater steric hindrance leads to lower inhibitory activity. In addition, the large green area at the R2 group indicates that increasing the volume of the substituent here is beneficial to the activity: compounds with R2 = methyl are much more active than those with R2 = H.

Fig. 4

a and b: CoMFA and CoMSIA steric contour maps around compound 15, respectively, favored (green) and disfavored (yellow); c: electrostatic contour maps of CoMFA: favored (blue) and disfavored (red); d: Hydrogen bond acceptor contour maps of CoMSIA: favored (magenta) and disfavored (cyan)

Figure 4b is the electrostatic field contour map of the CoMFA model. Compound 15 is largely surrounded by a blue area, indicating that increasing the positive charge of the groups is very beneficial to the inhibitory activity. For example, the activity of the inhibitors increases distinctly when an electropositive methylpiperazine group is added to the end of the amide on the benzene ring of R3 or when an ethyl group is added at R2 (1 > 35, 9 > 18, 4 > 17, 2 > 23, 4 > 24, 7 > 26). It is worth noting that a red area appears at the junction with the amide end, which indicates that adding an electronegative, electron-withdrawing group here will increase the activity of the inhibitor. For instance, compounds with negatively charged amide groups have higher inhibitory activity than compounds 17, 18 and 35, which lack amide groups.

Analysis of the CoMSIA contour maps results

Figure 4c is the steric field contour map of the CoMSIA model. As in the CoMFA results, a green area exists at the top of the R2 group and a yellow area is present at the point where the skeleton links to the R3 group of compound 15, which indicates that adding a bulky group at R2 is beneficial to the activity, whereas increasing the bulk at the beginning of R3 is not conducive to improving the inhibitory activity. Unlike CoMFA, a green area appears at the end of the amide group in R3 and at the thiophene of R1, indicating that adding a bulky group in these areas is beneficial to the inhibitory activity. For example, when the C atom connected to the N atom of the amide is changed to a heavier N atom, the inhibitory activity increases (19 > 1, 21 > 3, 22 > 4); replacing the S atom of the thiophene group with the relatively lighter O atom decreases the inhibitory activity (15 > 13, 16 > 14, 33 > 31, 34 > 32).

The hydrogen bond acceptor contour map of CoMSIA is composed of magenta areas (70% favored level) and cyan areas (20% disfavored level), which indicate that providing a hydrogen bond acceptor is beneficial or not conducive to the activity, respectively. Figure 4d is the hydrogen bond acceptor contour map of the CoMSIA model. Magenta areas appear at the end of the amide of the most active compound 15 and at the top of the 4H-1,2,4-triazole, indicating that providing hydrogen bond acceptors here could increase inhibitor activity. For example, the inhibitory activity improves when the C atom at the end of the amide group is replaced with an N atom acting as a hydrogen bond acceptor (21 > 3, 22 > 4).

Analysis of the Topomer CoMFA contour maps results

Figure 5a is the steric field contour map of the Ra fragment cut from compound 15, and it is similar to the CoMFA result. The green area at the end of the R1 group indicates that adding a bulky group in this area is beneficial to the inhibitory activity. For instance, the inhibitory activity increases from compound 13 to the most active compound 15 when the R1 group changes from furan to thiophene; the pairs 9 > 10, 33 > 31 and 34 > 32 follow the same principle.

Fig. 5

The topomer CoMFA model contour maps of compound 15. a and c: steric contour maps around the Ra and Rb fragments of compound 15, respectively, favored (green) and disfavored (yellow); b and d: electrostatic contour maps around the Ra and Rb fragments of compound 15, respectively, electropositive favored (blue) and electronegative favored (red)

Figure 5b is the electrostatic field contour map of the Ra fragment of compound 15. A large red area is wrapped around the beginning of the Ra group and the right side of the thiophene group, indicating that increasing the electronegativity of the groups here is beneficial to the inhibitory activity. This explains why the inhibitory activity of compound 22, which has negatively charged groups on the benzene ring, is higher than that of compound 21, which has no substituents on the benzene ring. Also, since the order of electron richness is 2-thienyl > 3-thienyl > 2-furyl > 3-furyl, the inhibitory activity follows 15 > 16 > 14 > 13. As the electron-donating ability increases, the inhibitory activity increases (7 > 6 > 5 > 4 > 3).

Figure 5c is the steric field contour map of the Rb fragment of compound 15. A green area appears at the piperidine ring at the end of the amide, which indicates that adding a bulky group here is beneficial to the inhibitory activity; this result is similar to that of the CoMSIA model. For example, the inhibitory activity follows 1 > 19, 21 > 3 and 22 > 4, where the compounds in each pair have the same structure except that one C atom of the piperidine ring is replaced with a bulkier N atom. Additionally, the other inhibitor molecules in the training set that contain a piperidine or piperazine group are more active than compounds 17, 18 and 35, which lack these two groups.

Figure 5d is the electrostatic field contour map of the Rb fragment cut from compound 15, and it is similar to the results of the CoMFA model. There is a red area above the amide group, indicating that adding a negatively charged group here is beneficial to the activity. For example, compounds 17, 18 and 35, which have no amide group, have low activities compared to the others. A blue area appears at the connected C atom, indicating that adding a positively charged group here is beneficial to the activity. Changing the carbon atom linked to the amide to an N atom increases the activity (1 > 19, 21 > 3, 22 > 4).

Analysis of HQSAR contribution maps

The HQSAR atomic contribution map shows the influence of a single atom or fragment on the activity of the inhibitors, which helps to further optimize the structure from a 2D perspective. The contribution maps of compound 15, which has the highest inhibitory activity (IC50 = 77 nM), and compound 35, which has the lowest inhibitory activity (IC50 = 10000 nM), were compared to analyze the structure–activity relationship. In the contribution maps, the colors of the atoms reflect their influence on the activity. Green or yellow indicates a positive contribution, while orange or red indicates a negative contribution. Atoms shown in white make an intermediate contribution to the biological activity.

Figure 6a is the HQSAR contribution map of compound 35, which has the lowest activity. The two carbon atoms below the benzene ring and the pyrimidine ring connected to it are red, indicating that these atoms are not conducive to improving the activity of the inhibitors. The other parts of the structure are white, that is, neutral, which explains the lowest overall activity of this compound. In contrast, the HQSAR contribution map of the most active compound 15 in Fig. 6b shows positive contributions: a yellow atom appears at the top of the thiophene group, indicating a beneficial contribution to the activity. Unlike in compound 35, a piperidinyl group is linked to the benzene ring through an amide group, and the newly added oxygen atom, the methyl group at the end of the piperidine and the hydrogen atoms on the piperidine ring all make positive contributions. The comparison shows that adding an amide group to the benzene ring has a significant effect on improving activity.

Fig. 6

Atomic contribution diagrams of compound 35 (a) and compound 15 (b)

Molecular docking

Molecular docking can illustrate the binding of small-molecule inhibitors, as ligands, to target proteins. Because Surflex-dock docks at a specific site, the existing ligand must first be extracted from the original protein, and the docking results are then analyzed at that site according to the different conformations generated. A re-docking step is necessary to verify the docking protocol: the original ligand is docked back into the protein and the resulting pose is compared with the conformation found in the crystal structure. Figure 7a shows the protomol, which is consistent with the binding site of BI-2536. Figure 7b is the conformational overlay of the re-docked ligand with the reference ligand; their poses overlap almost completely, with a similarity of 0.85 given by the scoring function, indicating that the docking method we adopted is reasonable. The re-docking result shows that this approach can accurately reveal the binding of small-molecule ligands to the target protein BRD4.

Fig. 7

Ligand re-docking result and template compound docking result. a: Protomol (binding pocket model) at the specific site of the original ligand (silver represents the active site occupied by BI-2536 as ligand R78 202); b: superimposition of the re-docked ligand (green sticks) on the reference ligand (red sticks); c: re-docking result of the original ligand R78 202 with the BRD4 protein; d: best-scoring docking pose of template compound 15 with the BRD4 target protein; e: best-scoring docking pose of compound 20 in the active site of the BRD4 target protein; f: best-scoring docking pose of compound 35 in the active site of the BRD4 target protein (magenta dotted lines represent hydrogen bonds, green sticks represent amino acid residues)

Comparing the docking result of BI-2536 with that of template compound 15 at the same ligand position (Fig. 7c, d) shows that their poses are the same, each forming a hydrogen bond with the ASN140 residue. Re-docking gives total score, crash and polar values of 9.2580, −0.6745 and 1.1324, respectively; the corresponding values for template compound 15 are 8.4884, −1.8777 and 1.1556. Both docking results indicate that the small-molecule inhibitor binds firmly to the target protein. It is worth noting that the hydrogen bond between BI-2536 and ASN140 is formed through the carbonyl oxygen, whereas compound 15 binds the ASN140 residue at the same position through an N atom that serves as the hydrogen bond acceptor. This is consistent with the CoMSIA result that a hydrogen bond acceptor at this position favors inhibitory activity. It shows that providing a hydrogen bond acceptor here is an important reason for the stable binding of small-molecule ligands to the BRD4 target protein, and hence for their inhibitory ability.

In addition, since the R1 group provides no hydrogen bond acceptor or donor capable of binding to amino acid residues, the highly active compound 20 and the least active compound 35, which carry different R3 groups, were selected for docking into the active site of the BRD4 protein. Figure 7e shows the binding mode of compound 20 docked to the BRD4 protein. The nitrogen atom on the triazole still forms a hydrogen bond with the ASN140 residue, and the pose scores well, with total score, crash and polar values of 8.794, −1.3279 and 1.0276, respectively. Figure 7f shows the docking of the least active compound 35, which acts as a hydrogen bond acceptor and forms two hydrogen bonds with residues GLN85 and ASP88; its total score, crash and polar values are 5.9321, −2.6832 and 1.6461, respectively. Judging from the known compound activities and the docking results, the R3 group of the ligand does not need to form hydrogen bonds, which is consistent with the docking result of BI-2536 as ligand R78 202 in Fig. 1c. A larger terminal group at R3 increases the range over which the ligand molecule can move, whereas the R3 terminal group of compound 35 is smaller and its hydrogen bonds make the ligand more rigid, which further supports the HQSAR results.
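The docking scores quoted in this section are easier to compare side by side. The sketch below simply collects the total score, crash and polar values reported in the text and ranks the ligands by total score; the `DockingResult` type and the variable names are illustrative only. As Surflex-Dock scores are commonly described, the total score is expressed in −log10(Kd) units, the crash term reflects inappropriate penetration into the binding site (values closer to 0 are more favorable), and the polar term reflects the contribution of polar interactions.

```python
from typing import NamedTuple


class DockingResult(NamedTuple):
    ligand: str
    total_score: float
    crash: float
    polar: float


# Values as reported in the text for the re-docked ligand and compounds 15, 20 and 35.
results = [
    DockingResult("BI-2536 (re-docked)", 9.2580, -0.6745, 1.1324),
    DockingResult("compound 15", 8.4884, -1.8777, 1.1556),
    DockingResult("compound 20", 8.794, -1.3279, 1.0276),
    DockingResult("compound 35", 5.9321, -2.6832, 1.6461),
]

# Rank ligands from highest to lowest total score.
ranked = sorted(results, key=lambda r: r.total_score, reverse=True)

print(f"{'ligand':<20} {'total':>8} {'crash':>8} {'polar':>8}")
for r in ranked:
    print(f"{r.ligand:<20} {r.total_score:>8.4f} {r.crash:>8.4f} {r.polar:>8.4f}")
```

Sorted this way, compound 35 has both the lowest total score and the most negative crash value of the four ligands, even though its polar term is the largest.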

Structure–activity relationship analysis and summary

We have thoroughly discussed the structural requirements of 4,5-dihydro-[1,2,4]triazolo[4,3-f]pteridine derivatives as BRD4 protein inhibitors and, by establishing different QSAR models, provided new design ideas for cancer treatment drugs. The QSAR contour maps of compound 15, which has the best inhibitory effect, indicate that the group at the beginning of R1 should be more rotatable to avoid large steric hindrance, while the end of the R1 group should carry a negatively charged, bulky group. A methyl group or another bulky group at R2 is necessary to improve the inhibitory activity, and the benzene ring at R1 should also carry a more electronegative amide group. Finally, a bulky group should be attached at the end of the amide. Figure 8 summarizes the structure–activity relationships derived from the QSAR study, which should help in the future design of new BRD4 inhibitor compounds.

Fig. 8

Structural modification tips derived from QSAR studies

Design of new BRD4 inhibitors based on QSAR and molecular docking

Four representative novel compounds with theoretical inhibitory effects were designed on the basis of the summarized structure–activity relationship to test these structural inferences. The methyl group at R2, which significantly increases the inhibitory activity, was left unchanged, and the triazole group that provides the hydrogen bond acceptor was retained. The group connected to the amide on the benzene ring of the R3 group was changed to increase its bulk and positive charge, and the oxygen of the amide was replaced with a more negatively charged S atom. In addition, the methyl group at the junction of the R1 group was retained, and the terminal thiophene group was replaced with more electronegative groups. The structures of the designed compounds are shown in Table 8. The topomer CoMFA model and HQSAR model with the best predictive ability were used to predict their activities. The favorable predictions further support that the modification guidance obtained from QSAR modeling is valid and reasonable. Nevertheless, the predictive power of the QSAR models and the inhibitory activity of the designed compounds remain to be verified in vitro or in vivo.

Table 8 Structure and predicted activity of newly designed molecules

The docking results of the four newly designed compounds with theoretically good inhibitory effects are shown in Fig. 9. Compounds T1 (Fig. 9a), T2 (Fig. 9b) and T4 (Fig. 9d) still rely on the nitrogen atom on the triazole to form a stable hydrogen bond with ASN140. The terminal part of the ligand forms no interaction within the acceptable range, because the protomol there contains fewer amino acids or they are farther away. It is worth noting that the optimal conformation of compound T3 does not form a hydrogen bond through the N atom on the triazole. Instead, the conformation is inverted, and the O atom at the end of the benzene ring amide group forms a hydrogen bond with residue ILE146; in this inverted conformation the ligand is otherwise exposed to the protein without forming an interaction. The molecular docking results highlight the importance of the amide group at the end of R3 and of maintaining a strong hydrogen bond in the first bromodomain for improving inhibitory ability.

Fig. 9

Docking results of newly designed molecules. a: Newly designed T1 compound as a ligand for optimal binding to BRD4 protein; b: Newly designed T2 compound as a ligand for optimal binding to BRD4 protein; c: Newly designed T3 compound as a ligand for optimal binding to BRD4 protein; d: Newly designed T4 compound as a ligand for optimal binding to BRD4 protein (magenta dotted lines represent hydrogen bonds, green sticks represent amino acid residues)

Conclusion

In this work, a series of 4,5-dihydro-[1,2,4]triazolo[4,3-f]pteridine derivatives that inhibit the overexpression of BRD4 protein was selected to establish QSAR statistical models. The contour maps show how different substituents affect molecular activity. The R1 group should be rotatable and bulky. The methyl group at R2 has an important effect on increasing activity, and R3 should be a negatively charged, bulky group. Compared with the molecular docking results of the clinically applied BI-2536, the hydrogen bond acceptor at the top of the scaffold should be considered when designing new inhibitor molecules. Furthermore, four new compounds were designed on the basis of these structural modifications; their predicted activities and docking results were favorable, further supporting the QSAR results. Therefore, these models can be used not only to predict the inhibitory activities of this class of compounds as BRD4 inhibitors but also to guide the further design of BRD4 inhibitors.