Introduction

Many basic cellular activities, such as metabolic pathways or signal transduction, are carried out by series of associations and dissociations of biomolecules, especially proteins. Thus, deregulation of the network of protein–protein interactions (PPIs) often leads to cellular dysfunction and to severe diseases, such as cancers or degenerative diseases [1]. Accordingly, targeting abnormal PPIs with modulator molecules offers an attractive opportunity to discover new therapeutic compounds [2]. It is noteworthy that, compared to strategies that target isolated enzymes or receptors, targeting PPIs reduces the probability of drug resistance, since a mutation in one protein would require a compensatory mutation in its partner to keep the PPI functional [3]. However, in contrast to substrate binding sites in enzymes or receptors, protein–protein interfaces are generally broad and flat, and large molecules are often needed to disrupt them [4,5,6]. For this reason, peptide derivatives are particularly well suited to modulating PPIs efficiently, given their intermediate size between small organic compounds and antibodies. Also, peptide derivatives generally have higher specificity for their target than small compounds, which reduces the probability of undesirable side effects [7, 8]. Consequently, peptide derivatives have emerged as promising therapeutic avenues [9, 10], and a dozen peptides targeting PPIs are currently in clinical trials [11, 12].

Nonetheless, the peptide approach remains under-exploited, mainly because of the non-optimal pharmacokinetic properties inherent in peptides: they are easily degraded by proteases, they cross physiological barriers poorly, and they can induce undesirable immune responses [10, 13]. To overcome these limitations, it is recommended to reduce the peptidic nature of these molecules, for example by using non-natural amino acids or by cyclizing them. In particular, in addition to a lower sensitivity to proteases and a higher membrane permeability, cyclic peptides have a more constrained conformation than their linear counterparts, which reduces the entropic cost of binding and improves the affinity for their target. Cyclic peptide conformations also generally have preferential orientations of the amino acid side chains. To design peptides with side chain orientations optimal for binding a targeted protein, various computational techniques can be used to predict the conformations of cyclic peptides depending on their cyclization method (stapling, head-to-tail, disulfide, side chain to side chain...) and on the insertion of non-natural amino acids (N-methylated, \(\alpha\),\(\alpha\)-disubstituted...) [14, 15].

That being said, the main challenge in the drug discovery process remains finding the sequence of the peptide derivatives that will bind a target with high affinity and high specificity. Among the experimental approaches, display-based technologies [16, 17], especially phage display screening [18, 19], are probably the most widely used for this purpose. Basically, the method consists of generating huge libraries of peptides displayed on phage capsids, incubating these phages with proteins of interest attached to a surface, and identifying those that bind the immobilized proteins with high affinity. It is worth noting that phage-displayed cyclic peptides can be obtained by forming a covalent bond between two cysteine residues or between a cysteine and an inserted non-canonical amino acid bearing an electrophilic reactive group [16, 20]. Despite the undeniable efficiency of phage display techniques for discovering peptide binders of a protein, there is no guarantee that the identified peptides bind the protein surface involved in the targeted PPI, especially for large proteins or those that interact with multiple partners. Furthermore, applying this technique to identify binders of amyloid protein aggregates appears difficult and, as far as we know, no such application has been reported in the literature. Thus, it remains worthwhile to develop new methods for designing cyclic peptides that target not only a protein but a specific protein surface.

When the three-dimensional structure of a targeted protein–protein complex is known, a traditional approach to identify a peptide hit is to isolate from the binding interface the short peptide segments that contribute most to the binding energy of the complex. Then, these so-called minimal recognition motifs, hot segments, or self-inhibitory elements are modified to optimize their affinity, specificity, and pharmacokinetic properties. In this regard, it is worth mentioning that several computational tools have been developed to find the optimal linkers for cyclizing minimal recognition motifs [21,22,23]. If pharmacophores of a minimal recognition motif can be identified, then virtual screening of various peptide libraries can be performed against these pharmacophores to discover peptide binders of the targeted protein [24]. To help in this task, several computational tools have been developed to generate libraries of diverse peptides. For example, the Robetta server can generate libraries of helical, loop, or extended peptides [25]. Of particular interest is the program CycloPs, which can generate large and diverse libraries of cyclic and constrained peptides from natural and commercially available non-natural amino acids [26].

When the three-dimensional structure of only one partner of a protein–protein complex is known, and as long as the binding interface can be inferred, de novo design methods for cyclic peptides could be advantageous. Among the recent research in this direction, one can mention the stochastic evolutionary algorithm proposed by Soler et al. [27] or the anchor extension strategy developed by Hosseinzadeh et al. [28]. Fragment-based methods are alternative approaches that have emerged in recent years to design PPI modulators. They consist of screening a library of molecular fragments according to their affinity for a targeted protein and selecting those with the best binding energies. The selected fragments are then linked to form one modulator molecule. These approaches are, however, difficult to implement experimentally because small molecular fragments generally have low affinities for the target, which are difficult to measure by standard experiments. To overcome these experimental limits, in silico fragment-based approaches have been developed, since computational methods such as molecular docking can quantify protein-ligand interactions even when the affinity is very low.

Nevertheless, when targeting large protein surfaces, the molecular fragments should not be too small to have an appreciable binding specificity. Thus, only two types of libraries were successfully used in fragment-based design of PPI modulators, those composed of FDA-approved compounds or those constituted of natural substances [2]. However, the molecules yielded by these fragment libraries are far from being peptides and the chemical linking of already elaborated fragments can be tricky. In this context, we developed a novel in silico fragment-based approach to design peptidic PPI inhibitors. This method, called Des3PI (design of peptides targeting protein–protein interactions), performs docking calculations of a library of amino acids on a targeted protein surface and then links those with good binding energy in order to generate the sequence and structure of cyclic peptides which will likely bind the protein target with high affinity and specificity.

Methods

Building the fragment library

In this study, the fragment library is simply composed of the twenty natural proteinogenic amino acids. An initial three-dimensional structure for each of them was generated using the 2D to 3D structure conversion program MarvinSketch 6.2.1 from ChemAxon [29]. Then, atomic partial charges were assigned to each amino acid structure using the AM1-BCC model [30], and the structure was briefly minimized with 5 000 steepest descent steps using the Generalized AMBER Force Field (GAFF) [31]. It should be noted that all fragment amine and carboxyl groups were modeled in their neutral form, except the aspartate, glutamate, lysine, and arginine side chains, which were considered in their ionic state.

Finding the fragment preferential binding positions

First, the targeted protein surface was delineated by centering and sizing an Autodock Vina search box [32] around the area of interest (Fig. 1A). Then, each fragment was docked 50 times onto the defined protein surface using Autodock Vina [33]. For each docking, the 9 best scores were retained, yielding 9 × 50 × 20 = 9000 binding modes of all the 20 amino acids on the protein surface.
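The bookkeeping of this docking step can be sketched as follows. This is only an illustration, not the Des3PI code: the file names, the box center and size, and the use of a different random seed for each repeat are assumptions made here, while `receptor`, `ligand`, `center_*`, `size_*`, `num_modes`, `seed`, and `out` are standard Autodock Vina configuration keys.

```python
from itertools import product

AMINO_ACIDS = ["ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
               "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL"]
N_REPEATS = 50   # independent dockings per amino acid
N_MODES = 9      # best-scored binding modes kept per docking


def vina_config(receptor, ligand, center, size, seed, out):
    """Return the text of an Autodock Vina configuration file for one docking."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        *(f"center_{axis} = {value}" for axis, value in zip("xyz", center)),
        *(f"size_{axis} = {value}" for axis, value in zip("xyz", size)),
        f"num_modes = {N_MODES}",
        f"seed = {seed}",  # assumption: one seed per repeat
        f"out = {out}",
    ]
    return "\n".join(lines)


def docking_jobs(receptor, center, size):
    """Yield (amino acid, repeat index, config text) for every docking."""
    for aa, rep in product(AMINO_ACIDS, range(N_REPEATS)):
        ligand = f"{aa}.pdbqt"            # hypothetical file name
        out = f"{aa}_run{rep:02d}.pdbqt"  # hypothetical file name
        yield aa, rep, vina_config(receptor, ligand, center, size, rep + 1, out)


# Hypothetical receptor file and search box, for illustration only.
jobs = list(docking_jobs("target.pdbqt", center=(10.0, 5.0, -3.0), size=(30.0, 30.0, 20.0)))
total_modes = len(jobs) * N_MODES
print(len(jobs), "dockings,", total_modes, "binding modes")
```

The 20 × 50 = 1000 dockings yield 9000 binding modes only if Vina returns all 9 requested modes in every docking.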

Fig. 1

Des3PI workflow: (A) The library of amino acids is docked 50 times onto the targeted protein surface delineated by a green rectangular box. (B) All the binding modes are clustered to determine the hotspot locations. (C) The most recurrent amino acids are identified for each hotspot. (D) The hotspots that are close to each other are linked with glycine residues. (E) Steps A, B, C, and D are repeated 20 times. (F) For each class of peptides, the amino acid occurrence in the generated sequences is calculated, and (G) the five most promising peptide sequences are output. Lowercase g indicates glycine linkers

Then, the positions of the \(\alpha\)-carbons of all the binding modes were clustered using a hierarchical algorithm based on the method of centroids [34]. A criterion of 3.5 Å for the minimal distance between two centroids was chosen, this threshold being slightly lower than the mean distance between two successive \(\alpha\)-carbons in proteins and peptides (3.8 Å) [35]. To avoid considering the sparsely populated clusters, those with less than 0.1% of the 9 000 binding modes were not further taken into account, yielding the most significantly populated clusters as the hotspots of the future modulator peptide (Fig. 1B). By default, this cutoff parameter was fixed to 0.1%, but it can be adjusted in order to have a manageable number of hotspots. For example, in the case of Mcl-1 protein, cutoff values of 0.1%, 0.5%, and 1% lead to 9, 8, and 6 hotspots, respectively (Fig. 3). Finally, the most frequently found amino acid in each cluster is selected to generate the sequence of the peptide that Des3PI considers as a good binder of the targeted protein surface (Fig. 1C).
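The clustering and hotspot selection described above can be illustrated with a minimal pure-Python sketch. It is not the Des3PI implementation: the function names and the toy coordinates are hypothetical, and the naive merging loop below is only practical for small examples (for 9 000 binding modes, an optimized library routine would be needed).

```python
from collections import Counter
from math import dist


def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def centroid_clustering(points, min_separation=3.5):
    """Merge the two clusters with the closest centroids until all centroids
    are at least min_separation apart. Returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    centers = [tuple(p) for p in points]
    while len(clusters) > 1:
        d, a, b = min(
            (dist(centers[i], centers[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d >= min_separation:
            break
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
        centers.pop(b)
        centers[a] = centroid([points[k] for k in clusters[a]])
    return clusters


def find_hotspots(modes, min_separation=3.5, min_fraction=0.001):
    """modes: list of (amino_acid, (x, y, z)) C-alpha positions.
    Returns the hotspots, most populated first."""
    points = [xyz for _, xyz in modes]
    hotspots = []
    for members in centroid_clustering(points, min_separation):
        if len(members) < min_fraction * len(modes):
            continue  # sparsely populated cluster, discarded
        residues = Counter(modes[k][0] for k in members)
        hotspots.append({
            "position": centroid([points[k] for k in members]),
            "population": len(members),
            "residue": residues.most_common(1)[0][0],
        })
    return sorted(hotspots, key=lambda h: -h["population"])


# Toy data (invented): two dense groups and one isolated binding mode.
toy_modes = [("TRP", (0, 0, 0)), ("TRP", (0.5, 0, 0)), ("PHE", (0, 0.5, 0)),
             ("ARG", (10, 0, 0)), ("ARG", (10.4, 0, 0)), ("GLY", (30, 30, 30))]
toy_hotspots = find_hotspots(toy_modes, min_fraction=0.3)
print([(h["residue"], h["population"]) for h in toy_hotspots])
```

In this toy example the population cutoff is raised to 30% so that the isolated glycine is discarded; with the real 9 000 binding modes the default of 0.1% corresponds to 9 modes.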

Linking the hotspots into a cyclic peptide

Once the hotspots and their relative positions were determined, it is possible to visually choose those that are close enough to form one peptide and manually link them to generate a cyclic sequence (Fig. 1D). Alternatively, it is possible to use an algorithm that we developed to automatically perform these two tasks. This independent module that we integrated into Des3PI has been satisfactorily tested in several cases but may give some inaccurate or unexpected cyclic sequences when the hotspots do not have a clear cyclic geometry. This automated approach is presented below.

First, to identify the hotspots that are close enough to each other to form one cyclic peptide, the positions of all previously found hotspots were clustered using the same method as above but with a criterion of 12.5 Å, which separates groups of hotspots that are more than 3 times the mean distance between two successive \(\alpha\)-carbons in proteins and peptides apart. Moreover, we considered that a group of hotspots can form a promising cyclic peptide if it is composed of at least 4 hotspots.

Then, for a given group of hotspots, we determined the cyclic peptide sequence as follows: (i) The average plane of the group of hotspots was determined by a principal component analysis of their positions, and the projections of the hotspots on this plane were calculated. (ii) The most populated hotspot (which is arbitrarily defined as the first residue of the sequence) was chosen as the origin of a reference frame of the plane, whose axes are the first two principal components previously found. (iii) In this reference frame, the other hotspots were ordered according to their polar angle, from the lowest to the largest, in the interval [−180°; 180°]. It should be noted that, when two hotspots have close polar angle values (differing by less than 15°), the hotspot with the lowest radial distance is prioritized, except for the last two hotspots, for which the highest radial distance is prioritized in order to close the peptide cycle.
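Steps (i) to (iii) can be sketched in pure Python as below. This is a simplified illustration under stated assumptions, not the Des3PI module: the principal components are obtained by power iteration, their sign is fixed by an arbitrary convention chosen here, the tie-breaking rule for polar angles closer than 15° is omitted, and the hotspot labels, populations, and coordinates are invented.

```python
from math import atan2, degrees, sqrt


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def principal_axes(points):
    """First two principal components of 3D points (power iteration).
    Assumes the points are not all identical or collinear."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    dev = [[p[i] - mean[i] for i in range(3)] for p in points]
    cov = [[sum(d[i] * d[j] for d in dev) / n for j in range(3)] for i in range(3)]
    axes = []
    for _component in range(2):
        v = [1.0, 1.0, 1.0]
        for _ in range(500):
            w = [_dot(row, v) for row in cov]
            norm = sqrt(_dot(w, w))
            v = [x / norm for x in w]
        # Sign convention (an assumption): the largest component is positive.
        if max(v, key=abs) < 0:
            v = [-x for x in v]
        eigenvalue = _dot(v, [_dot(row, v) for row in cov])
        # Deflation: remove this component so the next pass finds the second one.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(3)] for i in range(3)]
        axes.append(v)
    return axes


def order_hotspots(hotspots):
    """hotspots: list of (label, population, (x, y, z)).
    Returns the labels ordered by polar angle, most populated hotspot first."""
    positions = [xyz for _, _, xyz in hotspots]
    pc1, pc2 = principal_axes(positions)
    start = max(range(len(hotspots)), key=lambda k: hotspots[k][1])
    origin = positions[start]

    def polar_angle(k):
        rel = [positions[k][i] - origin[i] for i in range(3)]
        # In-plane coordinates of the projected hotspot, relative to the origin.
        return degrees(atan2(_dot(rel, pc2), _dot(rel, pc1)))

    others = sorted((k for k in range(len(hotspots)) if k != start), key=polar_angle)
    return [hotspots[k][0] for k in [start] + others]


# Invented example: four hotspots lying roughly on a 4 Å x 2 Å rectangle.
toy = [("W", 120, (0.0, 0.0, 0.1)), ("R", 80, (4.0, 2.0, 0.1)),
       ("V", 60, (0.0, 2.0, -0.1)), ("D", 90, (4.0, 0.0, -0.1))]
ring_order = order_hotspots(toy)
print(ring_order)
```

Because the angles are sorted within [−180°; 180°], a group whose angles straddle the ±180° boundary may be ordered in a way that does not follow the ring, which is consistent with the remark that the automated module can give unexpected sequences when the hotspots do not have a clear cyclic geometry.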

Finally, once the sequence of hotspots in the designed peptide is known, Des3PI determines how many linkers are required to link two consecutive hotspots. In this study, we chose glycine as the linker and added between two consecutive hotspots a number of linkers equal to the integer part of their separating distance divided by 4.5 Å (Fig. 1D). This parameter was determined by trial and error on the Mcl-1 protein: too small a value (below 4.0 Å) yielded too many glycine residues between hotspots and peptides that were too large. Conversely, too large a value (above 5.0 Å) led to too few linkers and peptides that were too compact. In both cases, the peptides could not be correctly docked onto the targeted protein surface close to the hotspot positions determined by Des3PI (see the validation subsection).
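The linker rule can be written in a few lines. The sketch below is illustrative only: the function names and coordinates are hypothetical, and treating the last and first hotspots as a consecutive pair (to close the ring) is an assumption made here.

```python
from math import dist

LINKER_SPACING = 4.5  # distance (in Å) accounted for by one glycine linker


def n_linkers(pos_a, pos_b, spacing=LINKER_SPACING):
    """Integer part of the hotspot separation divided by the spacing."""
    return int(dist(pos_a, pos_b) / spacing)


def cyclic_sequence(ordered_hotspots, spacing=LINKER_SPACING):
    """ordered_hotspots: list of (one_letter_residue, (x, y, z)) in ring order.
    Glycine linkers are written in lowercase, as in Fig. 1."""
    parts = []
    n = len(ordered_hotspots)
    for k, (residue, pos) in enumerate(ordered_hotspots):
        next_pos = ordered_hotspots[(k + 1) % n][1]  # wraps around to close the ring
        parts.append(residue + "g" * n_linkers(pos, next_pos, spacing))
    return "".join(parts)


# Invented coordinates: separations of 4.0, 6.0, about 5.7, and 10.0 Å.
toy_ring = [("N", (0, 0, 0)), ("W", (4, 0, 0)), ("A", (4, 6, 0)), ("R", (0, 10, 0))]
print(cyclic_sequence(toy_ring))
```

With a spacing of 4.5 Å, the 4.0 Å pair receives no linker, the 6.0 Å and 5.7 Å pairs receive one each, and the 10.0 Å closing pair receives two.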

Generating the most promising peptide sequences

At this point, it should be noted that, when repeating steps A to D, different peptides can be obtained because of the stochastic search algorithm implemented in Autodock Vina [33]. To provide sequence diversity, the whole protocol described above was repeated 20 times to generate 20 peptides (Fig. 1E). The latter were categorized into different classes according to the number of hotspots and their geometry (Fig. 1F). We considered here that two peptides have a similar geometry when the RMSD between their hotspots is below 1.75 Å. This parameter was fixed by using a trial and error approach on Ras protein: we tested different values from 1.0 to 2.0 Å and visually inspected whether similar hotspot geometries were effectively in the same class, and conversely, whether different geometries could be separated in different classes (Fig. 2). Once the peptide classes were defined, we output for each of them the amino acid occurrence at each position of the peptide sequences, which can be visualized using the PSSMSearch server [36]. Finally, a score is attributed to each amino acid proportional to its occurrence and the peptide sequences with the highest sum of these scores are considered as the most promising cyclic peptides for binding the targeted protein surface (Fig. 1G).
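The occurrence-based scoring can be illustrated as follows. This is a sketch under assumptions: the score of an amino acid is taken as its raw count at a position, the candidate sequences are all combinations of the amino acids observed at each position, and the input sequences are invented rather than taken from Des3PI runs.

```python
from collections import Counter
from itertools import product


def position_occurrences(sequences):
    """Per-position amino acid counts for hotspot sequences of equal length."""
    return [Counter(column) for column in zip(*sequences)]


def best_sequences(sequences, top=5):
    """Rank candidate sequences by the sum of their per-position occurrences."""
    occurrences = position_occurrences(sequences)
    # Assumption: candidates combine the amino acids observed at each position.
    candidates = product(*(sorted(counts) for counts in occurrences))
    scored = [
        (sum(counts[aa] for counts, aa in zip(occurrences, cand)), "".join(cand))
        for cand in candidates
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))  # ties broken alphabetically
    return scored[:top]


# Invented hotspot sequences standing in for the peptides of one class.
toy_class = ["RVWA", "RVWA", "KVWA", "RVFA", "RIWA", "RVWS"]
ranking = best_sequences(toy_class)
for score, sequence in ranking:
    print(sequence, score)
```

Here the consensus sequence RVWA scores 5 + 5 + 5 + 5 = 20, and every single substitution scores 16.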

To complete the description of Des3PI, the numbers of runs (20) and of docking calculations (50) are briefly discussed below. Their impact on the sequences generated by Des3PI was assessed in the case of the Mcl-1 protein (Fig. S1). This benchmark shows that, for 10 runs, the amino acid occurrences in the generated sequences differ slightly when the number of dockings varies from 25 to 75. For 20 runs, the occurrences seem to converge for numbers of dockings larger than 50, and, for 30 runs, they are similar for all tested numbers of dockings. From these tests, we decided to fix the default numbers of runs and docking calculations to 20 and 50, respectively.

Validation using blind docking

To validate the method, the peptides proposed by Des3PI have to be synthesized and their affinity and/or PPI inhibition activity have to be experimentally quantified. However, these experimental validations can be difficult and time-consuming to implement. Thus, we propose here a two-step computational procedure to assess whether the generated peptides are likely to succeed.

The first step consists of blindly docking the best cyclic peptides designed by Des3PI on the targeted protein and verifying whether they preferentially bind the targeted surface. It should be stressed here that, unlike the previous dockings of the single amino acids, which were restricted to protein surfaces involved in protein–protein interfaces, the blind dockings of the designed peptides were performed on the entire proteins without specifying any targeted surface. Among the protein-peptide docking programs that can deal with flexible cyclic peptides, we chose AutoDock CrankPep (ADCP) [37, 38], which only requires the protein PDB file and the peptide sequence string as input. It is noteworthy that ADCP can yield different results depending on the first and last residues of the cyclic sequence given as input. Thus, for each peptide composed of n residues, we performed n docking calculations with different choices of the first and last residues of the cyclic sequence. Each docking run consisted of 50 independent searches of 2,500,000 Monte Carlo steps and generated the 100 best binding modes. Finally, the 100 × n preferential binding modes of each peptide were analyzed by computing the root-mean-square deviation (RMSD) of the \(\alpha\)-carbons relative to the hotspots generated by Des3PI. Then we checked for each peptide whether one or several binding modes among the 5% best scores were found close to the targeted protein surface, with a C\(\alpha\) RMSD relative to the Des3PI hotspots lower than 10 Å.
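The two ingredients of this test, the n input sequences and the pass criterion, can be sketched as below. The sketch assumes that the n inputs are the n cyclic rotations of the sequence and that lower ADCP scores are better; the function names and the docking results are hypothetical.

```python
def cyclic_rotations(sequence):
    """The n linear writings of a cyclic sequence of n residues."""
    return [sequence[k:] + sequence[:k] for k in range(len(sequence))]


def top_modes(modes, top_fraction=0.05):
    """modes: list of (score, rmsd). Returns the best-scored fraction."""
    ranked = sorted(modes, key=lambda m: m[0])  # lower score = better
    n_top = max(1, int(len(ranked) * top_fraction))
    return ranked[:n_top]


def passes_blind_docking(modes, top_fraction=0.05, rmsd_cutoff=10.0):
    """True if a top-scored mode lies within rmsd_cutoff (Å) of the hotspots."""
    return any(rmsd < rmsd_cutoff for _, rmsd in top_modes(modes, top_fraction))


def best_mode(modes, top_fraction=0.05):
    """Top-scored mode with the lowest RMSD (starting point for the MD step)."""
    return min(top_modes(modes, top_fraction), key=lambda m: m[1])


# Invented results: 100 modes far from the hotspots, except one.
far = [(-15.0 + 0.1 * i, 25.0) for i in range(100)]
hit_in_top = far[:3] + [(far[3][0], 6.2)] + far[4:]      # close mode ranked 4th
hit_too_low = far[:50] + [(far[50][0], 6.2)] + far[51:]  # close mode ranked 51st

print(cyclic_rotations("NWAR"))
print(passes_blind_docking(hit_in_top), passes_blind_docking(hit_too_low))
```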

Checking complex stability using MD simulations

In a second step, we selected the protein-peptide complex with the lowest peptide RMSD (among the 5% best scores) and verified its stability using molecular dynamics simulations performed with the GROMACS 2019.1 package [39]. The AMBER99SB-ILDN [40] and GAFF [31] force fields were used for the protein and peptide, respectively. Each complex was placed in a cubic box, so that the minimal distance between the solute and the cube faces was equal to 1 nm. Then the complex was solvated with TIP3P water molecules and neutralized with 0.15 mol/L of sodium chloride. The Lennard-Jones potentials were cut off at 1.2 nm and the Coulomb interactions were treated using the smooth PME method [41]. Each system was first minimized using 10 000 steps of the steepest descent method, then submitted to two short equilibration runs of 1 ns each, the first one to heat the system to 310 K using a Berendsen thermostat and the second one to equilibrate the pressure around 1 bar using the Parrinello-Rahman method. After that, a 200 ns production run was performed in the isothermal-isobaric (NPT) ensemble using the Nose-Hoover and Parrinello-Rahman coupling algorithms [42,43,44] with the time constants \(\tau _T=0.5\) ps and \(\tau _P=2.5\) ps. Newton’s equations of motion were integrated using the leap-frog algorithm with a time step of 2 fs, while keeping the length of all covalent bonds constant using the LINCS procedure [45]. MD trajectory frames were saved every 20 ps for subsequent analysis. Notably, contact residues were computed using the GROMACS gmx mindist tool and a cutoff value of 0.5 nm.
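For illustration, the production-run settings stated above can be collected into GROMACS run parameters as in the sketch below. Only the values given in the text are taken from the study; the coupling group, the compressibility, and the choice of the compressed trajectory output are assumptions, and several parameters of a complete run (for example the Coulomb cutoff) are not specified here.

```python
DT_FS = 2             # time step (fs)
PRODUCTION_NS = 200   # length of the production run (ns)
SAVE_EVERY_PS = 20    # interval between saved frames (ps)

nsteps = PRODUCTION_NS * 1_000_000 // DT_FS   # number of integration steps
nstxout = SAVE_EVERY_PS * 1_000 // DT_FS      # steps between saved frames

mdp = {
    "integrator": "md",              # leap-frog integrator
    "dt": DT_FS / 1000,              # ps
    "nsteps": nsteps,
    "nstxout-compressed": nstxout,   # assumption: frames written to the compressed trajectory
    "constraints": "all-bonds",
    "constraint-algorithm": "lincs",
    "coulombtype": "PME",
    "rvdw": 1.2,                     # nm
    "tcoupl": "nose-hoover",
    "tc-grps": "System",             # assumption: a single coupling group
    "tau-t": 0.5,                    # ps
    "ref-t": 310,                    # K
    "pcoupl": "Parrinello-Rahman",
    "tau-p": 2.5,                    # ps
    "ref-p": 1.0,                    # bar
    "compressibility": 4.5e-5,       # bar^-1, assumption (usual value for water)
}

mdp_text = "\n".join(f"{key:<22} = {value}" for key, value in mdp.items())
print(mdp_text)
print("frames saved:", nsteps // nstxout)
```

A 200 ns run with a 2 fs time step corresponds to 100 000 000 steps, and saving every 20 ps (10 000 steps) gives 10 000 frames in addition to the initial one.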

Results and discussion

Peptides generated by Des3PI

Des3PI was first applied to identify cyclic peptides targeting three proteins which are involved in three different types of protein–protein interfaces: the protein Ras which binds Raf via an \(\alpha\)-helix and a \(\beta\)-strand, Mcl-1 which interacts with the \(\alpha\)-helical BH3 motif of PUMA, and a protofibril of A\(\beta\) which mainly involves \(\beta\)-strand/\(\beta\)-strand interactions.

The three-dimensional structure of Ras was taken from a crystallographic structure of the Ras-Raf complex (PDB ID: 3KUD [46]). The 20 runs of Des3PI on Ras generated peptides with either 4, 5, or 6 hotspots. The 20 peptides could be categorized into four classes whose amino acid occurrences are displayed in Fig. 2. From the occurrences, we output the five best peptide sequences in each class. When comparing the positions and amino acid compositions of class IV hotspots with the Raf residues in contact with Ras, hotspots 2 and 6, which are mainly populated with Arg, are located at the same positions as two Raf Lys residues. Furthermore, hotspot 3, which is essentially a Val, is found at the same location as a Raf Val residue. The three other hotspots are not clearly related to the Raf residues observed around their positions. Overall, half of the six hotspots are composed of amino acids with properties similar to those of the Raf residues at the same locations.

Fig. 2

Des3PI generated 4 classes of peptides potentially binding Ras protein. The five best peptide sequences of each class were generated according to the amino acid occurrences in the peptide hotspots

In experimental efforts to discover inhibitors of the oncogenic K-Ras proteins, Wu et al. screened about 3 million cyclic peptides against the K-Ras G12V mutant and identified 20 sequences that can bind K-Ras with submicromolar affinity and disrupt its interactions with Raf [47]. Interestingly, these identified sequences are rich in aromatic residues and Arg, similarly to the sequences output by Des3PI. Notably, their most promising cyclic peptide (compound 12) has the sequence dNle-Fpa-Arg-dNal-Arg-Arg, where dNle is a D-norleucine, Fpa a fluorophenylalanine, and dNal a D-2-naphthylalanine [47]. The similarity in composition between their sequence and those of the Des3PI class IV peptides suggests that our computational approach generated relevant peptide binders of Ras.

For Mcl-1, we applied Des3PI to a three-dimensional structure extracted from the first model of the NMR structure of Mcl-1 in complex with PUMA (PDB ID: 2ROC [48]). In one run, Des3PI only found 3 hotspots, and in the other 19, the algorithm generated 6 hotspots (representing at least 1% of the 9 000 binding modes of the 20 amino acids, instead of the default threshold of 0.1%). In the latter case, the hotspots have a very similar geometry and can be grouped into one class of peptides. However, although the sixth hotspot was close enough to the other five to form a peptide (according to the clustering criterion of 12.5 Å), it led to a non-obvious cyclic geometry (Fig. 3). Therefore, we decided to manually remove this hotspot and only keep the remaining five, which were able to form a cyclic peptide. The amino acid occurrences in the 19 peptides and the derived five best peptide sequences are reported in Fig. 3. It should be noted that these five hotspots only partially occupy the Mcl-1 binding groove, which normally accommodates the PUMA \(\alpha\)-helix. More specifically, referring to the pocket nomenclature of the Mcl-1 binding groove by Denis et al. [49], the Des3PI hotspots are located in the hydrophobic pockets P2 and P3. It is therefore not surprising that the amino acids most frequently found at hotspots 2, 3, 4, and 5 are mainly hydrophobic ones. An Asn residue is always found at hotspot 1, close to the position of a PUMA Arg residue (which makes an intramolecular salt bridge with an Asp). Overall, the amino acids frequently found at these five hotspots are consistent with those of PUMA involved in binding Mcl-1.

Fig. 3

Des3PI generated one class of peptides potentially binding Mcl-1 protein. The five best peptide sequences were generated according to the amino acid occurrences in the peptide hotspots

Among the known inhibitors of Mcl-1, many are small organic compounds with a central indolic, heterocyclic, or aromatic scaffold which occupies the P2 pocket, another hydrophobic group connected to the central scaffold which occupies the P3 pocket, and a carboxylic acid group which interacts with Mcl-1 Arg263 [49]. Alternatively, peptide inhibitors of Mcl-1 have been identified by screening BH3-based libraries [50, 51]. It was observed that these helical peptide binders had sequences very similar to those of natural BH3 helices, with four hydrophobic residues on one side of the \(\alpha\)-helix which occupy the four Mcl-1 binding pockets, and one Asp residue between the third and fourth ones which makes a salt bridge with Mcl-1 Arg263. In comparison, Des3PI also found four hydrophobic hotspots, but they are not aligned like those in BH3-like helices and only occupy two of the four Mcl-1 binding pockets (P2 and P3). Moreover, we did not retrieve an Asp residue close to Mcl-1 Arg263. Instead, Des3PI output an Asn at hotspot 1, which could easily make a hydrogen bond with it. Overall, the peptides designed by Des3PI have a different topology from BH3-based \(\alpha\)-helices, but might tightly occupy half of the Mcl-1 binding groove.

Regarding the design of peptides targeting the A\(\beta\) protofibril, the results provided by Des3PI are more diverse than for Ras and Mcl-1, due to the larger area of the targeted surface. The latter is the surface perpendicular to the principal axis of the dimeric S-shaped protofibril of A\(\beta\) resolved by solid-state NMR (PDB ID: 5KK3 [52]). We extracted from the PDB structure the inner 2 × 5 A\(\beta\) molecules of the protofibril and applied Des3PI to the axial surface composed of the chains C and L. In several runs, Des3PI was able to find 2 or 3 groups of hotspots close enough to form cyclic peptides. Across the 20 runs, Des3PI identified three different areas that could be potential peptide binding sites (Fig. 4). The first one was systematically retrieved in the 20 runs, and Des3PI provided here one class of cyclic peptides with 4 hotspots. The second area was identified in 17 of the 20 runs, and, here also, Des3PI found only one class of cyclic peptides with 4 hotspots. The third area was retrieved in 11 of the 20 runs, but our algorithm generated here 4 different classes of cyclic peptides, all of them having 4 hotspots except the last one, which has 5 (Fig. 4). Altogether, we could provide 12 different peptide sequences that potentially bind 3 different areas of the surface perpendicular to the A\(\beta\) protofibril axis.

Fig. 4

Des3PI generated six classes of peptides potentially binding A\(\beta\) protofibril. For each class, at most five best peptide sequences were generated according to the amino acid occurrences in the peptide hotspots

The nature of the amino acids which most frequently occur in Des3PI hotspots is generally consistent with the A\(\beta\) residues at the protein–protein interface. In class I peptides, an Ile is mainly encountered at hotspot 2, which is located at the same position as an A\(\beta\) Ile residue. At hotspots 3 and 4, two Ser were found close to two A\(\beta\) Gly and one His. Lastly, an Arg is always found at hotspot 1, close to A\(\beta\) Glu and His residues. In class II peptides, hydrophobic amino acids are always found at hotspots 1, 2, and 4, which are located in the area of four A\(\beta\) hydrophobic residues (two Phe, one Ala, and one Val). The Thr found at hotspot 3 is situated at the position of an A\(\beta\) Val residue. Similarly, the class VI peptides have four hotspots composed of hydrophobic amino acids and positioned in the vicinity of two Phe, one Ala, and one Val. A Gly is always retrieved at hotspot 4, which is situated at the position of an A\(\beta\) Lys residue (Fig. 4). Overall, except for class I peptides, which rather bind the A\(\beta\) C-terminal segment, Des3PI mainly generated hydrophobic peptides which target the \(^{18}\)VFFA\(^{21}\) central region.

This observation is an encouraging outcome of our computational approach since the A\(\beta\) self-recognition element \(^{16}\)KLVFFA\(^{21}\) is a major target of A\(\beta\) aggregation inhibitors. Naturally, many of them are peptides or peptidomimetics designed from this sequence [53], including cyclic peptides [54,55,56]. Nonetheless, high-throughput screening approaches can identify peptide inhibitors with sequences more diverse than the self-recognition one. For instance, Richman et al. synthesized a library of head-to-tail cyclic D,L-\(\alpha\)-hexapeptides and identified among them the two sequences lLwHsK and sHwHsK (where lower and upper case letters denote D- and L-amino acids, respectively) which can inhibit A\(\beta\) aggregation [57]. In another study, Wang et al. performed an in silico screening of amyloidogenic hexapeptide databases to find those which are prone to dimerize into a \(\beta\)-sheet. Among 11 identified ones, 6 hexapeptides exhibited strong binding affinity to A\(\beta\) in SPR experiments, and among them, the two sequences CTRIYWG and GTVWWG could strongly inhibit A\(\beta\) aggregation in ThT fluorescence assays [58]. It should be noted that these two experimental studies revealed peptides with amino acids (Ile, Leu, Thr, Trp, and Tyr) similar to those which frequently appear in Des3PI sequences. This suggests that our approach could design potentially good peptide binders and inhibitors of A\(\beta\) oligomers.

Validation of Des3PI peptides by blind docking

To validate the method, we set up a two-step computational procedure to check whether the generated peptides are likely to succeed. First, the selected peptides were blindly docked on the protein target using the ADCP program [37, 38]. We considered that a peptide passes this test if at least one binding mode among the 5% lowest scores is retrieved close to the Des3PI hotspots, with an RMSD of the peptide C\(\alpha\) atoms lower than 10 Å. Secondly, we selected the protein-peptide complex with the lowest peptide RMSD (among the 5% best scores) and verified its stability by molecular dynamics simulations.

For Ras protein, the graphs displaying the ADCP scores versus RMSD of the docked peptides (Figs. 5 and 6) show that all 20 peptides generated by Des3PI have at least one low energy binding mode close to the targeted surface. It should be noted that the peptides with only 4 hotspots (class II) have overall moderate binding energies, as might be expected. However, although the peptides with 6 hotspots (class IV) have binding energies among the lowest ones, they did not outperform those of class I which have only 5. This indicates that the presence of specific amino acids in the peptides is more important for the binding than their size. For the second step of the validation procedure, we could have checked the stability of all the 20 peptides in complex with Ras, but, because of our limited computational resources, we chose to submit only one representative peptide of each class to the MD simulation step (QRAWR, NWAR, DVWGR, and DRVWAW).

Fig. 5

ADCP score of the binding modes on Ras protein of the 20 best peptides (class I and II) generated by Des3PI as a function of their RMSD relative to the hotspot positions

Fig. 6

ADCP score of the binding modes on Ras protein of the 20 best peptides (class III and IV) generated by Des3PI as a function of their RMSD relative to the hotspot positions

In contrast, the blind docking of the peptides generated by Des3PI for Mcl-1 gave more ambivalent results than for Ras (Fig. 7). Among the five selected peptides, only NFWIW clearly has low energy binding modes on the targeted surface of Mcl-1. Nevertheless, for each of the two peptides NFFKW and NWFIW, one binding mode was found at the boundary of the criteria for validating the test. Compared to the Ras protein, this mixed success for Mcl-1 might be due to the shape of its binding interface with the PUMA BH3 \(\alpha\)-helix, which is longer and narrower than the rather flat targeted surface of Ras. This might explain not only the lower number of peptide classes found by Des3PI for Mcl-1 (Fig. 3) when compared to Ras (Fig. 2) but also the medium success rate of the peptide blind docking on Mcl-1, given that the designed peptides are cyclic, rather planar, and not helical. Despite this, we continued the validation procedure for the 3 mentioned peptides (NFWIW, NFFKW, and NWFIW) by checking the stability of their complex with Mcl-1 using MD simulations.

Fig. 7

ADCP score of the binding modes on Mcl-1 protein of the 5 best peptides generated by Des3PI as a function of their RMSD relative to the hotspot positions

Regarding the 12 peptides designed by Des3PI for targeting A\(\beta\), the results of their blind docking (Fig. 8) show that (i) none of the 5 peptides of class I could be successfully docked close to the Des3PI hotspots. Nevertheless, the peptide RISS had one binding mode at the boundary of the test criteria and was further considered in the second validation step; (ii) each of the two sequences LFTW and LWTW of class II has one binding mode satisfying the criteria for validating the test; and (iii) 3 of the last 5 peptides generated by Des3PI (WYGK, WYGW, and WYIG) were successfully docked to their target. Given the very large area of the targeted surface of the A\(\beta\) protofibril, we consider the success rate of 6 peptides out of 12 rather encouraging. These 6 peptides were further evaluated by submitting their complexes with the A\(\beta\) protofibril to MD simulations.

Fig. 8
ADCP score of the binding modes on A\(\beta\) protofibril of the 12 best peptides generated by Des3PI as a function of their RMSD relative to the hotspot positions

Altogether, of the 37 peptides targeting Ras, Mcl-1, or A\(\beta\), 29 could be successfully docked onto the targeted protein surfaces. This satisfactory success rate (78%) indicates that Des3PI can generate cyclic peptide sequences with a high probability of binding a targeted protein interface.

Checking protein-peptide complex stability by MD simulations

The best binding mode (i.e. the one with the lowest RMSD with respect to the Des3PI hotspots among the 5% best scores) of each of the 4 peptides NWAR, DVWGR, QRAWR, and DRVWAW on the Ras protein was used as the starting conformation for two independent MD simulations. In all 8 simulations, the RMSD of the Ras protein relative to its initial conformation stabilized between 1 and 2 Å (Fig. S2). Overall, the 4 peptides remain on the targeted surface of Ras (Fig. 9), except in one simulation of peptide NWAR (class II) and, to a lesser extent, in one simulation of DVWGR (class III), which corroborates the moderate binding energies obtained from their docking calculations (Figs. 5 and 6). The 2 peptides QRAWR (class I) and DRVWAW (class IV) have the most stable positions on the protein surface. Notably, the 6-hotspot peptide DRVWAW largely occupies the Ras surface involved in binding to Raf and appears to be the most promising inhibitor of Ras-Raf interactions.
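The selection rule stated above (lowest RMSD among the 5% best scores) can be illustrated with the following minimal Python sketch. It is not the authors' script: the function name, the rounding of the 5% cut-off (ceiling, with at least one mode kept), and the example scores and RMSD values are all assumptions made for illustration.

```python
def select_best_binding_mode(modes, top_percent=5):
    """Pick the mode with the lowest RMSD among the best-scoring ones.

    modes: list of (score, rmsd) tuples, where a lower (more negative)
    docking score is better. Assumption: the top `top_percent` % of modes
    is rounded up, and at least one mode is always kept.
    """
    if not modes:
        raise ValueError("no binding modes provided")
    ranked = sorted(modes, key=lambda mode: mode[0])  # best score first
    # Integer ceiling division avoids floating-point rounding surprises.
    n_top = max(1, -(-len(ranked) * top_percent // 100))
    return min(ranked[:n_top], key=lambda mode: mode[1])


# Hypothetical docking output: 40 modes, so the top 5% contains 2 modes.
modes = [(-18.2, 9.5), (-17.9, 3.1), (-17.5, 1.2)]
modes += [(-10.0 - 0.1 * i, 15.0) for i in range(37)]

best = select_best_binding_mode(modes)
print(best)
```

In this made-up example, the mode with the lowest RMSD overall (1.2) is not selected because its score falls outside the top 5%, which is the intended effect of combining both criteria.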

Fig. 9
Time evolution of the peptide C\(\alpha\) RMSD relative to the Des3PI hotspots in the Ras-peptide complex MD simulations. Snapshots in the left and right columns represent the complex initial and final structures, respectively. Green patches on protein surface indicate Ras residues in contact with Raf (PDB ID: 3KUD [46])

Regarding Mcl-1, all MD simulations of its complexes with the 3 successfully docked peptides show that the ligand positions in the protein binding cavity are stable over 200 ns (Fig. 10). Given that the rather planar shape of these cyclic peptides does not fit well the rather long and narrow binding site of Mcl-1, we expected that their binding to the protein would not be very stable. However, our MD simulations indicate the opposite tendency, which can be accounted for by the fact that Mcl-1 can distort to accommodate the cyclic peptides [59]. Indeed, as shown in Fig. S3, the protein RMSD increased to larger values (between 3 and 4 Å) than those of Ras, which does not need to deform to bind the cyclic peptides. In the end, the 3 peptides NFWIW, NFFKW, and NWFIW successfully passed the two-step assessment.

Fig. 10
Time evolution of the peptide C\(\alpha\) RMSD relative to the Des3PI hotspots in the MD simulations of their complexes with Mcl-1. Snapshots in the left and right columns represent the complex initial and final structures, respectively. Green patches on protein surface indicate Mcl-1 residues in contact with PUMA (PDB ID: 2ROC [48])

In the MD simulations of the 6 peptides bound to A\(\beta\), the protofibril RMSD stabilized at higher values (between 3 and 7 Å) than those of the globular proteins Ras and Mcl-1 (Fig. S4). Regarding the peptides, the following observations can be made (Fig. 11): (i) the RISS peptide unbound from the A\(\beta\) protofibril in one of its simulations, crossed the simulation box, and bound the opposite surface of the protofibril (figure not shown). In the second simulation, the peptide remained attached to the targeted surface overall but transiently unbound from the protofibril, indicating that this peptide is not tightly held in place; (ii) the peptides LFTW and LWTW of class II remained attached and quite close to the targeted surface of the protofibril, even if translations away from their initial position could be observed in half of their simulations. It should be noted that these translations may lead these peptides to interact with the A\(\beta\) residues symmetrical to those initially contacted; (iii) the last 3 assessed peptides also remained bound to the targeted surface of the A\(\beta\) protofibril. Nevertheless, the peptides WYGW and WYIG also moved away from their putative binding site in half of their simulations, in contrast to WYGK, which stayed firmly around its targeted surface in both of its simulations.

Fig. 11
Time evolution of the peptide C\(\alpha\) RMSD relative to the Des3PI hotspots in the MD simulations of their complexes with A\(\beta\) protofibril. Snapshots in the left and right columns represent the complex initial and final structures, respectively. Green patches on protofibril surface indicate A\(\beta\) residues initially in contact with docked peptides or residues symmetrical to the former ones

To sum up, three quarters of the simulated peptides targeting Ras, three of the three targeting Mcl-1, and two thirds of those targeting A\(\beta\) were shown to form dynamically stable complexes with their targeted protein. This success rate of 77% (10 of the 13 simulated peptides) again suggests that the peptides designed by Des3PI have a good chance of stably binding a targeted protein surface.
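For transparency, the success rates quoted in this section can be reproduced from the peptide counts given in the text, as in the short Python sketch below. The dictionary layout and function name are illustrative choices. Multiplying the two pooled rates is one plausible way to estimate the overall success of the two-step procedure; this is an assumption about how the steps combine, not a calculation stated by the source.

```python
# (successes, attempts) per target, taken from the counts reported in the text.
docked = {"Ras": (20, 20), "Mcl-1": (3, 5), "Abeta": (6, 12)}
stable = {"Ras": (3, 4), "Mcl-1": (3, 3), "Abeta": (4, 6)}


def pooled_rate(counts):
    """Pooled success rate over all targets (total successes / total attempts)."""
    successes = sum(s for s, _ in counts.values())
    attempts = sum(n for _, n in counts.values())
    return successes / attempts


docking_rate = pooled_rate(docked)   # 29 of 37 docked peptides
md_rate = pooled_rate(stable)        # 10 of 13 simulated peptides
combined = docking_rate * md_rate    # assumed estimate for both steps together

print(f"docking: {docking_rate:.0%}, MD: {md_rate:.0%}, combined: {combined:.0%}")
```

Note that only 4 of the 20 docked Ras peptides were simulated, so the combined value is an extrapolation rather than a directly measured rate.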

Analyzing the targeted protein contact residues

For each of the three targeted proteins, the most promising inhibitory peptides exhibit, after blind docking and MD simulations, RMSD values relative to the Des3PI hotspots between 0.4 and 1.1 nm. These values, which can be considered significant, indicate that the peptide binding modes observed in simulations are not exactly those expected by Des3PI and reflect some conformational changes, global rotations, and/or translations of the peptides on the protein surface. Nevertheless, as illustrated in Figs. 9 and 10, in most of the simulated protein-peptide complexes, the ligand steadily occupied a large part of the interface area and might therefore competitively inhibit the binding of the protein partner. To support this assumption, we compared the protein residues that are contacted by the peptides during simulations with those that are in contact with the partners in experimental complex structures.
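As an illustration of this kind of comparison, the Python sketch below computes, for each protein residue, the percentage of trajectory frames in which it is contacted by the peptide, and then intersects the frequently contacted residues with a reference interface. The distance cut-off (4.0 Å), the 50% occupancy threshold, the function names, and the toy distance values are assumptions made for this example and are not taken from the present study.

```python
import numpy as np


def contact_occupancy(min_dist, cutoff=4.0):
    """Percentage of frames in which each residue is in contact.

    min_dist: array of shape (n_frames, n_residues) holding the minimal
    peptide-residue distance (in Å) in each frame. The cut-off is an
    assumed value, not the one used in the study.
    """
    min_dist = np.asarray(min_dist, dtype=float)
    return 100.0 * (min_dist <= cutoff).mean(axis=0)


def interface_coverage(occupancy, residue_names, interface, min_percent=50.0):
    """Reference interface residues contacted in at least min_percent % of frames."""
    contacted = {
        name for name, occ in zip(residue_names, occupancy) if occ >= min_percent
    }
    return sorted(contacted & set(interface))


# Toy data: 4 frames, 3 residues ("ResX" is a placeholder residue).
residue_names = ["Ile36", "Glu37", "ResX"]
min_dist = [
    [3.0, 6.0, 3.5],
    [3.2, 5.0, 4.5],
    [3.9, 3.8, 7.0],
    [4.0, 8.0, 9.0],
]
occupancy = contact_occupancy(min_dist)
covered = interface_coverage(occupancy, residue_names, {"Ile36", "Glu37"})
print(occupancy, covered)
```

In practice, the distance matrix would be extracted from the MD trajectories with a trajectory analysis tool, and the reference interface from the experimental complex structure.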

Regarding the Ras protein, of the 10 residues of Ras involved in the interface with Raf, 7 residues (Ile21, Gln25, Ile36, Glu37, Asp38, Ser39, and Tyr40) are contacted by the Des3PI peptides during an extensive part of all the MD trajectories, except in the first MD simulation of the Ras-NWAR complex (Fig. 12). This indicates that the three peptides DVWGR, DRVWAW, and QRAWR steadily occupy a large portion of the Ras-Raf interface and thus might be potent inhibitors of this complex.

Fig. 12
Percentage of the MD trajectory times for which Ras residues are contacted by the inhibitory peptides (black bars). Red bars indicate Ras residues in contact with Raf in the 3KUD structure [46]

In the six MD simulations of Mcl-1 complexes, the peptides could not occupy the whole binding groove that accommodates the PUMA \(\alpha\)-helix in the 2ROC structure [48]. Indeed, the four residues Phe299, Phe300, Val302, and Gln303, which are located at one extremity of the groove (right side of the binding site colored in green in Fig. 10), are never contacted by the three simulated peptides (data not shown). Nevertheless, of the 18 residues that constitute the rest of the Mcl-1 interface with PUMA, at least 10 residues (His205, Ala208, Phe209, Met212, Lys215, Val230, His233, Val234, Thr247, and Phe251) are in contact with the peptides during a large part of the MD trajectories (Fig. 13). By stably occupying about half of the Mcl-1 binding groove, the three designed peptides might be considered promising inhibitors of Mcl-1 interactions with PUMA.

Fig. 13
Percentage of the MD trajectory times for which Mcl-1 residues are contacted by the inhibitory peptides (black bars). Red bars indicate Mcl-1 residues in contact with PUMA in the 2ROC structure [48]

Regarding A\(\beta\) protofibrils, the targeted binding surface is composed of residues 11–42 of two chains and is clearly too large to be entirely occupied by the six peptides designed with Des3PI. For this reason, we compared the A\(\beta\) residues most frequently contacted in simulations with those from the docking calculations, in order to check the stability of the binding modes of the docked peptides. As displayed in Fig. 14, the first simulated peptide, RISS, could not remain in the location predicted by docking, confirming that this binding mode is not stable. For the two peptides LFTW and LWTW, Fig. 14 shows that, overall, they both kept their docking position on the A\(\beta\) protofibril surface during the simulations. It should be noted that, in the first MD simulation, LFTW slightly shifted to region 25–34 of the first A\(\beta\) chain, while in the second trajectory it moved significantly toward the symmetrical residues 14–20 and 30–34 of the second A\(\beta\) chain. Regarding the peptides of the last class, we observe a similar dynamic behavior for the two peptides WYGK and WYIG, which, overall, remained in the same area as their docking position but slightly moved toward the A\(\beta\) region 36–41 during their simulations (Fig. 14). In contrast, WYGW is more mobile and can span the binding surface to reach the symmetrical A\(\beta\) chain. Altogether, these analyses suggest that the four peptides LFTW, LWTW, WYGK, and WYIG are good binders of the targeted surface of the A\(\beta\) protofibril and might be good potential inhibitors of A\(\beta\) aggregation.

Fig. 14
Percentage of the MD trajectory times for which A\(\beta\) residues are contacted by the inhibitory peptides (black bars). Red bars indicate here A\(\beta\) residues in contact with each peptide after the blind docking. Orange bars indicate the A\(\beta\) residues symmetrical to the previous ones

Peptide pharmacological properties

As highlighted by Vinogradov et al., optimizing the pharmacological properties of cyclic peptides remains a major challenge in their design [10]. We think that this task is beyond the scope of this study, which mainly aims at finding peptide sequences with potentially high affinity for a targeted protein surface. Nevertheless, we would like to briefly discuss here some pharmacological properties of the most promising peptides found by Des3PI (Table 1).

Table 1 Some pharmacological properties of the most promising Des3PI peptides

Unsurprisingly, the molecular weights of the cyclic peptides exceed Lipinski’s threshold of 500 Da [61] and, due to the large number of backbone hydrogen bond donors and acceptors, their polar surface areas are much larger than the 140 Å\(^2\) criterion of Veber et al. [62]. It can also be noted that, despite the presence of many hydrophobic side chains in Des3PI peptides, they are globally rather hydrophilic, as indicated by their negative octanol-water partition coefficient LogP. Altogether, these data suggest that our peptides could hardly diffuse through biological membranes and would have poor oral bioavailability. Nevertheless, beyond the fact that many drugs do not satisfy the rule of 5 [63], it is possible to improve the passive diffusion of peptides through biological membranes by reducing the number of backbone hydrogen bond donors. This could be achieved by using N-methylated amino acids, as in the orally bioavailable immunosuppressant cyclosporine, a cyclic 11-residue peptide containing seven N-methylated amino acids [64].
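The two sets of criteria mentioned above can be checked programmatically once the descriptors are available. The Python sketch below encodes the commonly cited thresholds of the rule of 5 (molecular weight ≤ 500 Da, LogP ≤ 5, at most 5 hydrogen bond donors and 10 acceptors) and of the Veber criteria (at most 10 rotatable bonds, polar surface area ≤ 140 Å\(^2\)). The class and function names are illustrative, and the descriptor values in the example are hypothetical; they are not those of Table 1.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float      # molecular weight in Da
    log_p: float           # octanol-water partition coefficient
    h_donors: int          # hydrogen bond donors
    h_acceptors: int       # hydrogen bond acceptors
    rotatable_bonds: int
    tpsa: float            # topological polar surface area in Å^2


def lipinski_violations(d):
    """Return the list of rule-of-5 criteria that are violated."""
    checks = [
        (d.mol_weight > 500, "MW > 500 Da"),
        (d.log_p > 5, "LogP > 5"),
        (d.h_donors > 5, "H-bond donors > 5"),
        (d.h_acceptors > 10, "H-bond acceptors > 10"),
    ]
    return [label for violated, label in checks if violated]


def veber_violations(d):
    """Return the list of Veber criteria that are violated."""
    checks = [
        (d.rotatable_bonds > 10, "rotatable bonds > 10"),
        (d.tpsa > 140, "TPSA > 140 Å^2"),
    ]
    return [label for violated, label in checks if violated]


# Hypothetical descriptors for a small cyclic peptide (not from Table 1).
peptide = Descriptors(mol_weight=700.0, log_p=-1.0, h_donors=9,
                      h_acceptors=8, rotatable_bonds=10, tpsa=250.0)
print(lipinski_violations(peptide), veber_violations(peptide))
```

Counting the violated criteria, rather than returning a single pass/fail flag, makes it easier to see which property (for instance the number of hydrogen bond donors) would need to be reduced, e.g. by N-methylation.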

Finally, it should be noted that SwissADME [60] predicts that our peptides are between poorly and moderately soluble in water, as indicated by their LogS values between −10 and −4. This is probably due to the presence of many hydrophobic and aromatic side chains in Des3PI peptides. Unfortunately, improving the peptide solubility by modifying these side chains would probably degrade both their affinity for the protein and their membrane permeability. As is the case for about 40% of approved drugs, such poorly soluble compounds can be administered using drug delivery systems, such as emulsions, liposomes, or polymer encapsulation [65].

Computation times

Before concluding, we would like to give an idea of the time needed to find new peptide binders of a targeted protein with Des3PI (Table 2). The generation of the peptide hotspots and sequences per se takes less than a day, at the end of which the user is provided with several peptide candidates that can be synthesized and tested experimentally. Increased confidence in Des3PI results can be achieved by running protein-peptide blind docking calculations, at the cost of a few extra computation hours. Naturally, the assessment of protein-peptide complex stability by MD simulations is by far the most time-consuming process, but we consider this step optional if experimental studies can be envisaged in the near future.

Table 2 Computation time of each step of Des3PI approach for the three proteins studied herein

Conclusion

Given a protein surface involved in a protein–protein interaction that we want to perturb with a cyclic peptide, Des3PI aims at identifying sequences that potentially bind the targeted surface with high affinity. In this report, the principle and algorithm of the fragment-based approach implemented in Des3PI were described in detail in the Methods section. Des3PI was applied to three different protein interfaces, one composed of \(\alpha\)-helices (Mcl-1), a second one of \(\beta\)-strands (A\(\beta\) protofibrils), and a third one comprising both of them (Ras). For each of these targets, Des3PI was able to provide at least five different peptide sequences with potential high affinity for the proteins. For large and flat protein surfaces, such as Ras or A\(\beta\) protofibril, Des3PI can even generate a dozen or more peptide hits.

The ability of the peptides designed by Des3PI to correctly and stably bind their targeted protein surfaces was tested by a two-step validation protocol consisting of a blind docking of the peptides onto the targeted proteins, followed by stability tests of the complexes found among the 5% best scores and with the lowest peptide RMSD with respect to the Des3PI hotspots. For Ras, all 20 peptides provided by Des3PI were successfully docked, and 4 representative ones in complex with Ras were submitted to MD simulations. Three of them exhibited good stability of their binding to the targeted surface of Ras. Among the 5 peptides that Des3PI yielded for Mcl-1, three successfully passed both the blind docking and the stability test. Finally, when targeting the A\(\beta\) protofibril, Des3PI generated 12 different peptides, half of which were successfully docked, and 4 of these 6 exhibited steady binding to the targeted surface perpendicular to the protofibril axis.

In the end, the overall success rate of the two-step validation procedure is about 60% for the three protein targets studied herein. This encouraging result suggests that our peptide design program Des3PI is a reliable tool for identifying, in the immense space of peptide sequences, those most likely to bind a targeted protein surface. Of course, these peptides need to be synthesized and tested in vitro to fully validate the approach. Such experimental studies have been initiated for the peptides targeting Mcl-1 and the A\(\beta\) protofibril and are in progress.